Here is my first effort to explain, in basic terms, the concept of a “module”. I tried to make it accessible to those not studying pure math, while still remaining interesting to those who are. I have no idea how well this can actually work in practice (or if it even *did* work); hopefully people will just skip over any terminology they don’t understand instead of giving up on the whole article. Better yet, ask me questions in the comments!

In linear algebra courses, we learn about *vector spaces*: these are algebraic structures where you can add two vectors ($u + v$), or scale a vector by some amount ($\lambda v$). In applied treatments, these “scalars” are usually tacitly assumed to come from the real or complex numbers ($\mathbb{R}$ or $\mathbb{C}$); it is seldom mentioned that the “scalar multiplication” of a vector space is really an *action* of a *field* $F$ on an *abelian group* $V$.

If you’re puzzled by this last phrase, don’t worry. The word *action* merely indicates a rule for assigning to each element of $F$ a transformation (indeed, a group homomorphism) of $V$, in a particularly nice way: to each $\lambda \in F$ we associate the “multiplication by $\lambda$” map $\mu_\lambda : V \to V$ given by the rule

$\mu_\lambda(v) = \lambda v$.

To indicate the field of scalars explicitly, we might call $V$ an **$F$-vector space** or a **vector space over $F$**. To summarize, an $F$-vector space is a set $V$, equipped with a rule for *adding* two elements of $V$, as well as a rule for *scaling* elements of $V$ “by” elements of $F$.

A **ring** is, loosely speaking, a structure in which we can add and multiply elements, which satisfies most of the usual arithmetic laws. Fields are a very special kind of ring: the multiplication of a field is **commutative** ($ab = ba$), and every nonzero element $a$ has a **reciprocal** $a^{-1}$. Since they have so many wonderful properties, they are much more “rigid”: on an elementary level, their nature is far less complicated than that of rings in general (don’t get me wrong, there’s still a lot we don’t know about fields). However, there are rings we deal with every day which (for simple reasons, e.g. the lack of reciprocals) are *not* fields, for example, the ring of integers: $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.

Something I’ve been meaning to learn more about for a while is “vector spaces where the scalars are allowed to come from *any ring at all, not necessarily a field*”. In mathematics we call these **$R$-modules**, or **modules over $R$** (or just **modules** when the ring of scalars $R$ is clear). Hence, a vector space is just a module over a field. Modules are immensely useful in the study of rings themselves, and most people glimpse them for the first time in a course on commutative algebra (perhaps when they begin as a grad student). Unfortunately, it will be another 8 months before I have a chance to take an actual course in commutative algebra (PMATH 446), and I’m too impatient for that.

One thing we notice about vector spaces is that their structure theory is trivial; it’s about as nice as it could possibly be. Namely, $F$-vector spaces are in some sense “completely determined” by their dimension. You’re probably familiar (at least in the case where the dimension $n$ is finite) with the fact that, by choosing a basis, any such space can be viewed simply as $F^n$.

Intriguingly, when we move from the setting of vector spaces to the broader world of modules, the “more complicated” personalities of general rings (compared to fields) mangle the situation significantly. A lot of our linear algebra, which we were able to develop with elementary methods, is vehemently defenestrated. In particular, it is no longer even true that a basis always exists for a module (in fact this is a pretty rare situation, and such modules are called *free*). This means our much-applauded concept of dimension doesn’t, in general, even *make sense* for modules. Nor is the “completely decomposable” nature of vector spaces shared by modules: it’s actually possible to construct *huge* modules which don’t even have a *single* “submodule” (other than the obvious ones, $\{0\}$ and the module itself).

For our very first example of some of the ideas modules generalize, let’s talk about **abelian groups**: these are simply sets $A$ equipped with an associative, commutative binary operation $+$, an identity element, and inverses. Given any abelian group $A$, we can define a rule for “scaling” elements of $A$ by integers, that is, elements of $\mathbb{Z}$. Namely, for $n > 0$ we define $n \cdot a = a + a + \cdots + a$ ($n$ times), simply using the group operation $+$; we set $0 \cdot a = 0$; and otherwise, if $n < 0$, we define $n \cdot a = -((-n) \cdot a)$ (that is, with recourse to the $n > 0$ case). This is a perfectly natural way of turning any abelian group into a $\mathbb{Z}$-module. On the other hand, it is obvious that if you start with a $\mathbb{Z}$-module you can just forget about the scalar multiplication altogether, and you’re left with an abelian group. So *$\mathbb{Z}$-modules are the same thing as abelian groups*.
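To make the “scaling by integers” rule concrete, here is a minimal Python sketch. The function name `scale` and the choice of the integers modulo 12 as the example group are my own assumptions for illustration; any abelian group given by an addition, a negation and an identity would do.

```python
def scale(n, a, add, neg, zero):
    """Return n . a in an abelian group described by (add, neg, zero)."""
    if n < 0:
        # Negative case: fall back on the positive case, then invert.
        return neg(scale(-n, a, add, neg, zero))
    result = zero
    for _ in range(n):  # n-fold repeated addition; the empty sum when n == 0
        result = add(result, a)
    return result

# Illustrative group (an assumption): integers modulo 12 under addition.
M = 12
add12 = lambda x, y: (x + y) % M
neg12 = lambda x: (-x) % M

print(scale(5, 7, add12, neg12, 0))   # 5 . 7 computed inside Z/12
print(scale(-3, 7, add12, neg12, 0))  # (-3) . 7 computed inside Z/12
```

Notice that the function never multiplies anything: the whole scalar multiplication is built from the group operation alone, which is why it carries no extra information.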

If you think back, you’ll probably recall that a large part of linear algebra had to do with linear operators (more concretely, matrices) and doing things with them, like finding their eigenvalues and eigenvectors, characteristic polynomials, determinants, traces, and so on. A lot of work was put into discussing when “diagonalisation” is possible, and how to achieve it. Since we’re not always lucky enough to be able to do this, you probably learned about **canonical forms**: the “next best thing” to diagonalisation where we usually try to get some kind of “block diagonal” form. Namely, Jordan canonical form, rational canonical form, and all that. So why should we even care about modules? Furthermore, why were linear operators (square matrices) so much subtler objects to deal with than vector spaces themselves?

The following cool idea provides what I believe is an epistemologically satisfactory answer. First, recall that the set of all polynomials with coefficients in $F$ forms a ring under the usual operations of addition and multiplication, known as the **polynomial ring** $F[x]$. Suppose $V$ is an $F$-vector space. I claim that a linear operator $T : V \to V$ is the same thing as an $F[x]$-module structure on $V$. Notice that $V$ is already an $F$-vector space, and any action must respect the ring structure of $F[x]$, so *my previous claim reduces merely to saying that a linear operator $T : V \to V$ is the same as a rule for scaling an element $v \in V$ by the element $x \in F[x]$.* What is the obvious thing to do? Well, simply define $x \cdot v = T(v)$ for all $v \in V$, right? Then right away, this gives us a scalar multiplication of $F[x]$ on $V$: namely,

$(a_0 + a_1 x + \cdots + a_n x^n) \cdot v = a_0 v + a_1 T(v) + \cdots + a_n T^n(v)$.

Here, of course, $T^k$ refers to the *composition* of $T$ with itself $k$ times, that is, the map $T \circ T \circ \cdots \circ T$. For this reason many authors will refer to this as an $F[T]$-module structure on $V$, since the indeterminate $x$ is literally acting as $T$. Okay, so every linear map $T : V \to V$ gives rise to an $F[x]$-module structure on $V$. What about the other way? That is, if we have some $F[x]$-module structure on $V$, can we get a linear map from it? We definitely can: define $T : V \to V$ by merely setting $T(v) = x \cdot v$, where $\cdot$ denotes the scalar multiplication of $V$ by $F[x]$, provided to us by the $F[x]$-module structure! Then, simply by the conditions we impose on how a “scalar multiplication rule” must behave, it follows that $T$ is linear.
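Here is a small Python sketch of a polynomial acting on a vector through a linear operator. The names `apply` and `act`, the coefficient-list encoding of a polynomial, and the particular matrix are all assumptions made for illustration.

```python
def apply(T, v):
    """Apply the square matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def act(p, T, v):
    """Compute p . v = a_0 v + a_1 T(v) + ... + a_n T^n(v).

    p is the coefficient list [a_0, a_1, ..., a_n] of a polynomial in x.
    Horner's rule is used: p . v = a_0 v + T(a_1 v + T(a_2 v + ...)).
    """
    result = [0] * len(v)
    for a in reversed(p):
        result = [a * x + y for x, y in zip(v, apply(T, result))]
    return result

# Illustrative operator (an assumption): a nilpotent 2x2 matrix.
T = [[0, 1],
     [0, 0]]
v = [1, 2]

print(act([0, 1], T, v))     # x . v, which is just T(v)
print(act([3, 2, 5], T, v))  # (3 + 2x + 5x^2) . v
```

The only design decision is that `x` acts as `T`; everything else about the action is forced by linearity, which is the content of the claim in the text.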

If we think about it for a second, we realize that the *submodules* of the $F[x]$-module obtained from the linear map $T$ are precisely the sub*spaces* of $V$ which are *invariant* under $T$, namely, the subspaces $W \subseteq V$ such that $T(w) \in W$ for all $w \in W$, or stated another way, $T(W) \subseteq W$. When we *diagonalise* a matrix by finding a basis consisting of eigenvectors, what we’re effectively doing is understanding how the associated linear map’s domain is made up of a bunch of *one-dimensional invariant subspaces* (each spanned by an eigenvector). Since we know this is not possible in general, we deduce that these modules will not, in general, admit a decomposition into one-dimensional submodules. It’s interesting to think about how properties of the matrix, like its characteristic polynomial for example, are encoded in the algebraic properties of the resulting $F[x]$-module…
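As a quick sanity check on the invariant-subspace picture, the following Python sketch tests whether the line spanned by a vector is invariant under a 2×2 matrix. The helper names and the Jordan block used as the example are my own choices for illustration.

```python
def apply(T, v):
    """Apply the square matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def spans_invariant_line(T, w):
    """True if the line spanned by the nonzero 2D vector w is T-invariant."""
    Tw = apply(T, w)
    # T(w) lies on the line through w exactly when this 2x2 determinant vanishes.
    return Tw[0] * w[1] - Tw[1] * w[0] == 0

# Illustrative operator (an assumption): a Jordan block, which is not diagonalisable.
J = [[1, 1],
     [0, 1]]

print(spans_invariant_line(J, [1, 0]))  # the eigenvector direction
print(spans_invariant_line(J, [0, 1]))
print(spans_invariant_line(J, [1, 1]))
```

For this matrix only the direction of the single eigenvector gives an invariant line, so the corresponding module has a one-dimensional submodule but cannot be split into two of them.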

Canonical form theory for square matrices over a field falls out as an easy consequence of structure theory for certain kinds of modules (to be precise, the “finitely-generated modules over principal ideal domains”). Recall that we previously mentioned the interchangeability of the concepts of “abelian group” and “$\mathbb{Z}$-module”. Since $\mathbb{Z}$ is one of the first examples of a “principal ideal domain”, the celebrated structure theorem for finitely generated abelian groups (and its special case for *finite* abelian groups) is also a *special case* of this theorem on modules! So, aside from the formality, one could almost argue that you were essentially doing some primitive, well-cloaked module theory in Linear Algebra 2.

To close off this quick initial glimpse into module theory, I will mention one more place modules crop up: a branch of mathematics called the **representation theory of finite groups**. Loosely speaking, a **representation** of a group is a way of viewing the group as some set of matrices acting on a vector space.

The Yoneda lemma from category theory tells us that contemplating how one algebraic structure *acts* on others can yield profound revelations about the object itself: for a concrete example, Cayley’s theorem in group theory says that every group “is” just a permutation group of some set, and this lies at the heart of why we study *representations* of groups, *modules* over rings, and so on.

Formally, a representation is a group homomorphism $\rho : G \to \mathrm{GL}(V)$ where $\mathrm{GL}(V)$ is the “automorphism group” of $V$, or in undoubtedly more friendly language, the *set of invertible linear operators* $V \to V$. It turns out that, in much the same way we whipped out a module over a polynomial ring in one variable to capture the “essence” of a linear operator on $V$, we can construct a pretty natural ring from the group $G$, known as the **group ring** or **group algebra**, denoted $F[G]$. Basically, you consider the set of all finite “formal sums” of elements in $G$ with coefficients from $F$ and define a multiplication on it by using the group operation of $G$. Then it turns out that representations of $G$ and $F[G]$-modules are (just like abelian groups and $\mathbb{Z}$-modules) *completely interchangeable concepts*. Of course, the analogy becomes a bit more complicated if you move to, say, the representation theory of topological (for example, Lie) groups, since then you need to introduce a kind of “analytic version” of the group algebra.
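For a concrete feel of what “group homomorphism into invertible matrices” means, here is a small Python sketch. The choice of the cyclic group of order 3 and of permutation matrices, and the names `rho` and `matmul`, are assumptions made purely for illustration.

```python
from itertools import product

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N = 3  # the cyclic group Z/3 = {0, 1, 2} under addition mod 3

def rho(g):
    """Permutation matrix sending the basis vector e_j to e_{(j + g) mod N}."""
    return [[1 if i == (j + g) % N else 0 for j in range(N)] for i in range(N)]

# Homomorphism property: rho(g + h) equals rho(g) rho(h) for every pair g, h.
is_hom = all(rho((g + h) % N) == matmul(rho(g), rho(h))
             for g, h in product(range(N), repeat=2))
print(is_hom)
```

Each `rho(g)` is automatically invertible, with inverse `rho(-g)`, because the group has inverses and `rho` respects the group operation.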

Anyway, I’ve barely scratched the surface of all the interesting questions you can ask. Thanks for reading, and again, feel free to leave questions or comments.

I’m a bit curious about the part about “representations”. How exactly can groups be viewed as a set of matrices/linear operators on a vector space? Also, would it be necessary for the group to be abelian to have a representation (since you were talking about the equivalence of Z-modules and abelian groups)?

When we say that a representation is a rule for “viewing a group as a set of operators acting on a vector space”, what we mean (precisely) is that a representation is a group homomorphism $\rho : G \to \mathrm{GL}(V)$ for some vector space $V$. It’s a way of “mapping” the elements of $G$ to (for concreteness purposes, let’s say) matrices. But it’s not just *any* way of doing so. It has to behave in such a way that *multiplying elements in the group corresponds to multiplying their corresponding matrices*. However, this is exactly what is captured by the requirement that $\rho$ be a group homomorphism. Does that help? Certainly, the concept of a representation makes sense (and is moreover interesting) regardless of whether the group is abelian, recalling that matrix multiplication is in general noncommutative…

To elaborate on the relationship between representations of $G$ and modules: as I described in the first paragraph, the “scalar multiplication rule” that comes bundled with a vector space $V$ gives you a rule for assigning, to each element $\lambda$ of the field $F$, a group endomorphism of $V$, namely the “multiplication by $\lambda$” map. The axioms of a vector space dictate that all of these “multiplication maps” *are* group homomorphisms.

Now, shift gears to the setting of representations. We have a group $G$ and a representation $\rho : G \to \mathrm{GL}(V)$. All $\rho$ does is give us, again, a way of mapping each element $g \in G$ to a linear transformation $\rho(g)$ of $V$, which plays nice with the structure of $G$. See how this is similar to what I mentioned above? Moreover, all of these $\rho(g)$ are *a priori* invertible, since $G$ is a group: we have $\rho(g)^{-1} = \rho(g^{-1})$. However, $G$ is not a ring, so it doesn’t really make sense to say that a representation is the same as a $G$-module. We solve this little hiccup by constructing, in the most naive way possible, a ring out of $G$. This is the group ring I described. It is an instructive exercise to verify that modules over this ring correspond *exactly* to the representations of $G$ on an $F$-vector space.

Awesome post and comments! Just a little side point. I would love for data types in programming languages to encompass the algebraic structures here. And I would love for them to be implemented at the low level, not as, say, composite algebraic data types in Haskell. It would be amazing if instruction set architectures allowed for field, ring, group types and ‘action’ operations. I think an algebraic paradigm language would just be freaking awesome! I wonder what style of expressions could be conjured up for it as well (e.g. the analog to if-else-then statements). Haskell is cool and all but I would like something that could potentially rival C/C++ maybe. I imagine things such as physics engines and graphics engines could be implemented better, especially if you were to administer a counter ISA to RISC-based ones. Might not be possible … would need to look into all transistor based logic possibilities. 😛 …. But as for ‘algebraically paradigmed’ physics/graphics engines: Symmetry/invariance in mathematical physics anyone? 😉

…Actually game engines may be a horrible way to talk about potential possibilities for a new paradigm, although they tend to be one of the few critically performance based pieces of software…

Michael, I thought you did a really great job on this post! Will this be an introduction to a new series of posts building on this from here on out? 😛

Of course, the new big thing in computing is concurrency! *Parallella* for example. Google it.

So I wonder what kind of concurrency methods you would be able to come up with/design/implement thinking algebraically?

*if-then-else conditional expression

.. Yep sorry for the (slightly) off-topic rant there. But I figured since this is not aimed at pure mathematicians I would ramble about idealistic applications.

You should really build on this one though going into more on group rings and power-series rings, ideals, prime ideals, spectrum, varieties, schemes(??), exact sequences, etc., etc, etc.

Please?

We all know you want to. I want to extend my rants as well. And I promise to learn LaTeX syntax to ‘properly’ contribute to a conversation.

About the latter, I thought about this for quite a bit and here’s my partially completed answer.

For our group , let be one such -module and

take where is over some vector space .

Then we can define a natural representation by:

for each , take

where the structure follows from our group algebra .

In the other direction, suppose we take a representation

where and .

If we let then

wouldn't we get nice group algebra if

or

for free groups?

I still haven't figured out if the form maps to a general linear group

in terms of invertibility, but let me know your thoughts.

Also, where can you find a good raw LaTeX to WordPress LaTeX converter? It's a bit tiresome typing in "$latex" … "$" all the time.

I had a lot of trouble parsing your post; I think you misunderstood the concept of a group ring. This is a topic I intend to elaborate on in future posts, but here’s at least a (much) clearer definition than I gave in the article. Let $G$ be a group, and $F$ be a field. The *group ring* $F[G]$ (note that this construction *does* depend on the field, not just the group!) is defined as follows: as a set, $F[G]$ is the set of all functions $f : G \to F$ such that $f(g) \neq 0$ for only *finitely* many $g \in G$. [Intuition: if this seems a bit abstract, you can think of $f(g)$ as the “coefficient” in front of $g$ when we write $f$ as a formal sum. The finiteness condition guarantees that these formal sums are finite, in the sense that only finitely many of these coefficients are nonzero.]

Since we want to turn this into a ring, we must first define an addition on it: we define $f_1 + f_2$ (in the obvious way) by putting $(f_1 + f_2)(g) = f_1(g) + f_2(g)$ for all $g \in G$. [The astute reader will observe that the additive (abelian group) structure of the group ring, at this point, is just that of the free $F$-vector space on the *set* $G$.] Now, to define $f_1 \cdot f_2$, the “formal sum” way of viewing things is much more handy. So we write

$f_1 = \sum_{g \in G} a_g \, g$ and $f_2 = \sum_{h \in G} b_h \, h$,

noting both of these sums are finite, and define

$f_1 \cdot f_2 = \left( \sum_{g \in G} a_g \, g \right) \left( \sum_{h \in G} b_h \, h \right)$.

We simply expand out the right-hand side and use the group law in $G$ to “multiply” each pair of terms together, which gives $\sum_{g, h \in G} a_g b_h \, (gh)$.
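The “expand and use the group law” recipe can be sketched in a few lines of Python. Here a formal sum is stored as a dictionary from group elements to coefficients; the function name `group_ring_mul`, the example group (integers mod 3, written additively) and the use of Python integers as stand-ins for rational coefficients are all assumptions for illustration.

```python
from collections import defaultdict

def group_ring_mul(f1, f2, op):
    """Multiply two formal sums, each a dict {group element: coefficient}."""
    out = defaultdict(int)
    for g, a in f1.items():
        for h, b in f2.items():
            out[op(g, h)] += a * b  # the term a_g b_h lands on the element gh
    # Drop zero coefficients so only the nonzero terms are stored.
    return {g: c for g, c in out.items() if c != 0}

# Illustrative group (an assumption): Z/3, with the group law written as addition mod 3.
op = lambda g, h: (g + h) % 3
f1 = {0: 1, 1: 2}    # the formal sum 1*[0] + 2*[1]
f2 = {1: 1, 2: -1}   # the formal sum 1*[1] - 1*[2]

print(group_ring_mul(f1, f2, op))
```

Storing only the nonzero coefficients is one way to respect the finiteness condition in the definition; the double loop is exactly the expansion of the product of the two sums.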

This may look like an awkward, roundabout construction, but the thing to take away from the group ring is *what it does for us*: $F[G]$ is a *ring* such that there is a very natural one-to-one correspondence between the *ring homomorphisms* $F[G] \to R$ and the *group homomorphisms* $G \to R^\times$. Here, $R^\times$ denotes the group of units of the ring $R$. Thus, noting that $\mathrm{End}(V)^\times = \mathrm{GL}(V)$, a ring homomorphism $F[G] \to \mathrm{End}(V)$ is “the same thing” as a group homomorphism $G \to \mathrm{GL}(V)$. Reformulating this *again*, a ring action of $F[G]$ on $V$ is “the same thing” as a representation of $G$ on $V$.

Oops, I just read my post again and realized that what I said about the correspondence seems incorrect. I will look at this again tomorrow. The property satisfied by $F[G]$ is something to that effect but it needs to be refined a little; I think $R$ needs to be replaced by an $F$-algebra and the ring homomorphisms need to be replaced by $F$-algebra homomorphisms. But since I don’t really want to get into all that, I’ll try to see if I can find a nicer way to formulate what the group ring is doing for us…

Okay, my intuition from last night was correct. Here is why it makes perfect sense to consider $F$-algebra homomorphisms in this setting: if we’re *given* a vector space $V$, then a representation of $G$ on $V$ is not just “any” $F[G]$-module structure on $V$. Indeed, $V$ is *already* an $F$-vector space, so representations of $G$ on $V$ are the same as those $F[G]$-module structures on $V$ which *agree* with the existing $F$-module structure on $V$. Now, recall that an $F[G]$-module structure on $V$ is a ring homomorphism $\varphi : F[G] \to \mathrm{End}(V)$. By the discussion above, though, we want $\varphi(\lambda e)(v) = \lambda v$ for all $\lambda \in F$ and $v \in V$. Here $e \in G$ is the identity element. But this is exactly the statement that $\varphi$ be an $F$-algebra homomorphism (i.e. a ring homomorphism that “fixes $F$”), since $F \cong Fe \subseteq F[G]$.

To be more elegant, we can simply say that $F$-representations of $G$ are the same as $F[G]$-modules, since such a module encodes *both* an $F$-vector space as well as an action of $G$ on that vector space. You’ll probably have more luck proving the equivalence of these two concepts now that these subtleties have been ironed out.