What is a module?

Here is my first effort to explain, in basic terms, the concept of a “module”. I tried to make it accessible to those not studying pure math, while still remaining interesting to those who are. I have no idea how well this can actually work in practice (or if it even did work); hopefully people will just skip over any terminology they don’t understand instead of giving up on the whole article. Better yet, ask me questions in the comments!

In linear algebra courses, we learn about vector spaces: these are algebraic structures where you can add two vectors (v+w), or scale a vector by some amount (\lambda v). In applied treatments, these “scalars” \lambda are usually tacitly assumed to come from the real or complex numbers (\mathbb{R} or \mathbb{C}); it is seldom mentioned that the “scalar multiplication” of a vector space is really an action of a field \mathbb{F} on an abelian group V.

If you’re puzzled by this last phrase, don’t worry. The word action merely indicates a rule for assigning to each element of \mathbb{F} a transformation (indeed, a group homomorphism) of V, in a particularly nice way: to each \lambda \in \mathbb{F} we associate the “multiplication by \lambda” map V \to V given by the rule

v \mapsto \lambda v.

To indicate the field of scalars explicitly, we might call V an \mathbb{F}-vector space or a vector space over \mathbb{F}. To summarize, an \mathbb{F}-vector space is a set V, equipped with a rule for adding two elements of V, as well as a rule for scaling elements of V “by” elements of \mathbb{F}.
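
For completeness, here is the standard list of compatibility conditions such a scaling rule is required to satisfy, for all \lambda, \mu \in \mathbb{F} and v, w \in V:

\lambda(v + w) = \lambda v + \lambda w, \quad (\lambda + \mu)v = \lambda v + \mu v, \quad (\lambda \mu)v = \lambda(\mu v), \quad 1v = v.

These same four axioms, with \mathbb{F} replaced by a more general ring, are exactly what will define the modules introduced below.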

A ring is, loosely speaking, a structure in which we can add and multiply elements, and which satisfies most of the usual arithmetic laws. Fields are a very special kind of ring: the multiplication of a field is commutative (ab = ba), and every nonzero element a has a reciprocal 1/a. Since fields have so many wonderful properties, they are much more “rigid”: on an elementary level, their nature is far less complicated than that of rings in general (don’t get me wrong, there’s still a lot we don’t know about fields). However, there are rings we deal with every day which, for simple reasons (e.g. the lack of reciprocals), are not fields; for example, the ring of integers:

\mathbb{Z} = \{ \ldots, -2, -1, 0, 1, 2, \ldots \}.

Something I’ve been meaning to learn more about for a while is “vector spaces where the scalars are allowed to come from any ring R at all, not necessarily a field”. In mathematics we call these R-modules, or modules over R (or just modules when the ring R of scalars is clear). Hence, a vector space is just a module over a field. Modules are immensely useful in the study of rings themselves, and most people usually glimpse them for the first time in a course on commutative algebra (perhaps when they begin as a grad student). Unfortunately, it will be another 8 months before I have a chance to take an actual course in commutative algebra (PMATH 446), and I’m too impatient for that.

One thing we notice about vector spaces is that their structure theory is trivial; it’s about as nice as it could possibly be. Namely, \mathbb{F}-vector spaces are in some sense “completely determined” by their dimension. You’re probably familiar (at least in the case where the dimension is finite) with the fact that, by choosing a basis, any such space can be viewed simply as \mathbb{F}^n.

Intriguingly, when we move from the setting of vector spaces to the broader world of modules, the “more complicated” persona of general rings (compared to fields) mangles the situation significantly. A lot of the linear algebra we were able to develop with elementary methods is vehemently defenestrated. In particular, it is no longer true that a basis always exists for a module (in fact this is a pretty rare situation, and the modules that do have a basis are called free). This means our much-applauded concept of dimension doesn’t, in general, even make sense for modules. Nor is the “completely decomposable” nature of vector spaces shared by modules: it’s actually possible to construct huge modules M which don’t have a single submodule other than the obvious ones \{ 0 \} and M.
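
To make the failure of bases concrete, a standard example: consider \mathbb{Z}/2\mathbb{Z} = \{0, 1\} as a \mathbb{Z}-module (using the scaling rule for abelian groups described below). Every element g satisfies 2g = 0, so no nonempty subset can be linearly independent over \mathbb{Z}; hence no basis, and no sensible notion of dimension, exists.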

For our very first example of some of the ideas modules generalize, let’s talk about abelian groups: these are simply sets equipped with an associative, commutative binary operation +, an identity element, and inverses. Given any abelian group G, we can define a rule for “scaling” elements of G by integers, that is, by elements of \mathbb{Z}. Namely, for n \geq 0 we define ng = g + g + \ldots + g (n times), simply using the group operation +, and if n < 0 we define ng = (-n)(-g) (that is, with recourse to the n \geq 0 case, since -n > 0). This is a perfectly natural way of turning any abelian group into a \mathbb{Z}-module. On the other hand, it is obvious that if you start with a \mathbb{Z}-module you can just forget about the scalar multiplication altogether, and you’re left with an abelian group. So \mathbb{Z}-modules are the same thing as abelian groups.
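
If it helps to see this rule operationally, here is a minimal Python sketch (the function names are my own invention, purely for illustration) of the integer scaling just described, for an abelian group presented by its operation, inversion, and identity:

def scale(n, g, add, neg, identity):
    """Compute n*g in an abelian group, where `add` is the group
    operation, `neg` sends an element to its inverse, and
    `identity` is the identity element."""
    if n < 0:
        # the n < 0 case reduces to the n >= 0 case: n*g = (-n)*(-g)
        return scale(-n, neg(g), add, neg, identity)
    result = identity
    for _ in range(n):  # n*g = g + g + ... + g (n times)
        result = add(result, g)
    return result

# Example: the integers mod 5 under addition form an abelian group,
# so they are automatically a Z-module.
print(scale(7, 3, lambda a, b: (a + b) % 5, lambda a: (-a) % 5, 0))   # 21 mod 5 = 1
print(scale(-2, 3, lambda a, b: (a + b) % 5, lambda a: (-a) % 5, 0))  # -6 mod 5 = 4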

If you think back, you’ll probably recall that a large part of linear algebra had to do with linear operators (more concretely, matrices) and doing things with them, like finding their eigenvalues and eigenvectors, characteristic polynomials, determinants, traces, and so on. A lot of work was put into discussing when “diagonalisation” is possible, and how to achieve it. Since we’re not always lucky enough to be able to do this, you probably learned about canonical forms: the “next best thing” to diagonalisation, where we usually try to get some kind of “block diagonal” form. Namely, Jordan canonical form, rational canonical form, and all that. So why should we even care about modules? Furthermore, why are linear operators (square matrices) so much subtler to deal with than vector spaces themselves?

The following cool idea provides what I believe is an epistemologically satisfactory answer. First, recall that the set of all polynomials with coefficients in \mathbb{F} forms a ring under the usual operations of addition and multiplication, known as the polynomial ring \mathbb{F}[x]. Suppose V is an \mathbb{F}-vector space. I claim that a linear operator T : V \to V is the same thing as an \mathbb{F}[x]-module structure on V. Notice that V is already an \mathbb{F}-vector space, and any action must respect the ring structure of \mathbb{F}[x], so my previous claim reduces merely to saying that a linear operator T : V \to V is the same as a rule for scaling an element v \in V by the element x \in \mathbb{F}[x]. What is the obvious thing to do? Well, simply define xv = T(v) for all v \in V, right? Then right away, this gives us a scalar multiplication of \mathbb{F}[x] on V: namely,

(a_n x^n + \ldots + a_1 x + a_0)v = a_n T^n(v) + \ldots + a_1 T(v) + a_0 v.

Here, of course, T^n refers to the composition of T with itself n times, that is, the map (T \circ \ldots \circ T) : V \to V. For this reason many authors will refer to this as an \mathbb{F}[T]-module structure on V, since the indeterminate x is literally acting as T. Okay, so every linear map gives rise to an \mathbb{F}[x]-module structure on V. What about the other way? That is, if we have some \mathbb{F}[x]-module structure on V, can we get a linear map T : V \to V from it? We definitely can: define T by merely setting T(v) = xv where xv denotes the scalar multiplication of v by x, provided to us by the \mathbb{F}[x]-module structure! Then, simply by the conditions we impose on how a “scalar multiplication rule” must behave, it follows that T is linear.
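
As a quick numerical sanity check (a NumPy sketch of my own; the specific matrix is chosen just for illustration), here is the polynomial x^2 + 1 acting on a vector through a 90-degree rotation T:

import numpy as np

# T rotates the plane by 90 degrees; V = R^2 becomes an R[x]-module via x.v = T(v)
T = np.array([[0.0, -1.0],
              [1.0,  0.0]])
v = np.array([1.0, 0.0])

# Act by the polynomial p(x) = x^2 + 1: p(x).v = T(T(v)) + v
pv = T @ (T @ v) + v
print(pv)  # [0. 0.], since T^2 = -I

Since T^2 = -I, the polynomial x^2 + 1 sends every vector to zero; in module language, x^2 + 1 annihilates this \mathbb{R}[x]-module, which is exactly the statement that x^2 + 1 is the minimal polynomial of T.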

If we think about it for a second, we realize that the submodules of the module obtained from the linear map T : V \to V are precisely the subspaces of V which are invariant under T, namely, the subspaces W \subseteq V such that T(w) \in W for all w \in W, or stated another way, T(W) \subseteq W. When we diagonalise a matrix by finding a basis consisting of eigenvectors, what we’re effectively doing is understanding how the associated linear map’s domain is made up of a bunch of one-dimensional invariant subspaces (the eigenspaces). Since we know this is not possible in general, we deduce that these modules will not, in general, admit a decomposition into one-dimensional submodules. It’s interesting to think about how properties of the matrix, like its characteristic polynomial for example, are encoded in the algebraic properties of the resulting \mathbb{F}[x]-module…
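
In the same spirit (again, just an illustrative sketch of my own), one can check numerically that the span of an eigenvector really is an invariant subspace, i.e. a submodule:

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
vals, vecs = np.linalg.eig(A)
w = vecs[:, 0]  # an eigenvector, spanning a 1-dimensional subspace W
Aw = A @ w
# A(w) lands back in W = span{w}: it is a scalar multiple of w
print(np.allclose(Aw, vals[0] * w))  # True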

Canonical form theory for square matrices over a field falls out as an easy consequence of structure theory for certain kinds of modules (to be precise, the “finitely-generated modules over principal ideal domains”). Recall that we previously mentioned the interchangeability of the concepts of “abelian group” and “\mathbb{Z}-module”. Since \mathbb{Z} is one of the first examples of a “principal ideal domain”, the celebrated structure theorem for finitely generated abelian groups (and its special case for finite abelian groups) is also a special case of this theorem on modules! So, aside from the formality, one could almost argue that you were essentially doing some primitive, well-cloaked module theory in Linear Algebra 2.
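
For reference, the statement in question (standard, and not proved here) says that any finitely generated module M over a principal ideal domain R decomposes as

M \cong R^r \oplus R/(d_1) \oplus R/(d_2) \oplus \ldots \oplus R/(d_k), \qquad d_1 \mid d_2 \mid \ldots \mid d_k,

for some r, k \geq 0 and nonzero nonunits d_i \in R. Taking R = \mathbb{Z} gives the structure theorem for finitely generated abelian groups; taking R = \mathbb{F}[x] and applying the theorem to the module determined by a linear operator T yields the rational canonical form of T.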

To close off this quick initial glimpse into module theory, I will mention one more place modules crop up: a branch of mathematics called the representation theory of finite groups. Loosely speaking, a representation of a group G is a way of viewing the group as some set of matrices acting on a vector space.

The Yoneda lemma from category theory tells us that contemplating how one algebraic structure acts on others can yield profound revelations about that structure itself: for a concrete example, Cayley’s theorem in group theory says that every group “is” just a permutation group of some set, and this lies at the heart of why we study representations of groups, modules over rings, and so on.

Formally, a representation is a group homomorphism G \to \mathrm{GL}(V) where \mathrm{GL}(V) is the “automorphism group” of V, or in undoubtedly more friendly language, the set of invertible linear operators T : V \to V. It turns out that, in much the same way we whipped out a module over a polynomial ring in one variable to capture the “essence” of a linear operator on V, we can construct a pretty natural ring from the group G, known as the group ring or group algebra, denoted \mathbb{F}[G]. Basically, you consider the set of all finite “formal sums” of elements in G with coefficients from \mathbb{F} and define a multiplication on it by using the group operation of G. Then it turns out that representations \varphi : G \to \mathrm{GL}(V) of G and \mathbb{F}[G]-modules are (just like abelian groups and \mathbb{Z}-modules) completely interchangeable concepts. Of course, the analogy becomes a bit more complicated if you move to, say, the representation theory of topological (for example, Lie) groups, since then you need to introduce a kind of “analytic version” of the group algebra.
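
To spell out the correspondence in one formula: given a representation \varphi : G \to \mathrm{GL}(V), the associated \mathbb{F}[G]-module structure on V is obtained by extending \varphi linearly,

\left( \sum_{g \in G} a_g g \right) v = \sum_{g \in G} a_g \, \varphi(g)(v),

and conversely, restricting an \mathbb{F}[G]-scalar multiplication to the elements of G (viewed as formal sums with a single term) recovers a representation.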

Anyway, I’ve barely scratched the surface of all the interesting questions you can ask. Thanks for reading, and again, feel free to leave questions or comments.


10 Responses to What is a module?

  1. I’m a bit curious about the part about “representations”. How exactly can groups be viewed as a set of matrices/linear operators on a vector space? Also, would it be necessary for the group to be abelian to have a representation (since you were talking about the equivalence of Z-modules and abelian groups)?

    • mlbaker says:

      When we say that a representation is a rule for “viewing a group as a set of operators acting on a vector space”, what we mean (precisely) is that a representation is a group homomorphism \rho : G \to \mathrm{GL}(V) for some vector space V. It’s a way of “mapping” the elements of G to (for concreteness purposes, let’s say) n \times n matrices. But, it’s not just any way of doing so. It has to behave in such a way that multiplying elements in the group corresponds to multiplying their corresponding matrices. However this is exactly what is captured by the requirement that \rho be a group homomorphism. Does that help? Certainly, the concept of a representation makes sense (and is moreover interesting) regardless of whether the group is abelian, recalling that matrix multiplication is in general noncommutative…

    • mlbaker says:

      To elaborate on the relationship between representations of G and modules: as I described in the first paragraph, the “scalar multiplication rule” that comes bundled with a vector space gives you a rule for assigning, to each element \lambda of the field \mathbb{F}, a group endomorphism of V, namely the “multiplication by \lambda” map. The axioms of a vector space are exactly what guarantee that all of these “multiplication maps” really are group homomorphisms.

      Now, shift gears to the setting of representations. We have a group G and a representation \rho : G \to \mathrm{GL}(V). All \rho does is give us, again, a way of mapping each element g \in G to a linear transformation \rho(g) of V, which plays nice with the structure of G. See how this is similar to what I mentioned above? Moreover, all of these \rho(g) are a priori invertible, since G is a group: we have \rho(g^{-1}) \rho(g) = \rho(g)^{-1} \rho(g) = \mathrm{id}. However, G is not a ring, so it doesn’t really make sense to say that a representation is the same as a G-module. We solve this little hiccup by constructing, in the most naive way possible, a ring out of G. This is the group ring I described. It is an instructive exercise to verify that modules over this ring \mathbb{F}[G] correspond exactly to the representations of G on an \mathbb{F}-vector space.

  2. jentrep says:

    Awesome post and comments! Just a little side point. I would love for data types in programming languages to encompass the algebraic structures here. And I would love for them to be implemented at a low level, not as, say, composite algebraic data types in Haskell. It would be amazing if instruction set architectures allowed for field, ring, group types and ‘action’ operations. I think an algebraic paradigm language would just be freaking awesome! I wonder what style of expressions could be conjured up for it as well (e.g. the analog to if-else-then statements). Haskell is cool and all but I would like something that could potentially rival C/C++ maybe. I imagine things such as physics engines and graphics engines could be implemented better, especially if you were to administer a counter ISA to RISC-based ones. Might not be possible … would need to look into all transistor based logic possibilities. 😛 …. But as for ‘algebraically paradigmed’ physics/graphics engines: Symmetry/invariance in mathematical physics anyone? 😉

    …Actually game engines may be a horrible way to talk potential possibilities for a new paradigm although they tend to be one of the few critically performance based pieces of software…

    Michael, I thought you did a really great job on this post! Will this be an introduction to a new series of posts building on this from here on out? 😛

    Of course, the new big thing in computing is concurrency! *Parallella*, for example. Google it.
    So I wonder what kind of concurrency methods you would be able to come up with/design/implement thinking algebraically?

    • jentrep says:

      *if-then-else conditional expression

      .. Yep sorry for the (slightly) off-topic rant there. But I figured since this is not aimed at pure mathematicians I would ramble about idealistic applications.

      You should really build on this one, though, going into more on group rings and power-series rings, ideals, prime ideals, spectrum, varieties, schemes(??), exact sequences, etc., etc., etc.

      Please?

      We all know you want to. I want to extend my rants as well. And I promise to learn LaTeX syntax to ‘properly’ contribute to a conversation.

  3. About the latter, I thought about this for quite a bit and here’s my partially completed answer.

    For our group G, let M be one such \mathbb{F}[G]-module and
    take m\in M where \mathbb{F}[G] is over some vector space V.
    Then we can define a natural representation G\mapsto GL(V) by:
    for each g\in G, take
    g\mapsto T(g)=\left\{ \sum_{finite\ v\in V}k_{v}\cdot v,k_{v}=\sum_{i=1}^{n}(g^{i}m_{i}),n\in\mathbb{N},m_{i}\in M\ \forall i\right\}
    where the structure follows from our group algebra \mathbb{F}[G].

    In the other direction, suppose we take a representation R=G\mapsto GL(V)
    where R(g)=\sum_{i\in\mathbb{I}_{g}}\lambda_{i}v_{i} and |\mathbb{I}_{g}|<\infty.
    If we let F[m_{g}]=\sum_{i\in\mathbb{I}_{g}}(f_{i}g^{i})v_{i} then
    wouldn't we get nice group algebra \mathbb{F}[G] if
    \mathbb{F}[G]=\bigcup_{g\in G}\mathbb{F}[m_{g}]
    or
    \mathbb{F}[G]=\bigoplus_{g\in G}\mathbb{F}[m_{g}]
    for free groups?

    I still haven't figured out if the form maps to a general linear group
    in terms of invertibility, but let me know your thoughts.

    Also, where can you find a good raw LaTeX to WordPress LaTeX converter? It's a bit tiresome typing in "$latex" … "$" all the time.

    • mlbaker says:

      I had a lot of trouble parsing your post; I think you misunderstood the concept of a group ring. This is a topic I intend to elaborate on in future posts, but here’s at least a (much) more clear definition than I gave in the article. Let G be a group, and F be a field. The group ring F[G] (note that this construction does depend on the field, not just the group!) is defined as follows: as a set, F[G] is the set of all functions f : G \to F such that f(g) \neq 0 for only finitely many g \in G. [Intuition: if this seems a bit abstract, you can think of f(g) as the “coefficient” of f in front of g when we write it as a formal sum. The finiteness condition guarantees that these formal sums are finite, in the sense that only finitely many of these coefficients are nonzero.]

      Since we want to turn this into a ring, we must first define an addition on it: we define (in the obvious way) f+f' : G \to F by putting (f+f')(g) = f(g) + f'(g) for all g \in G. [The astute reader will observe that the additive (abelian group) structure of the group ring, at this point, is just that of the free F-vector space on the set G.] Now, to define ff', the “formal sum” way of viewing things is much more handy. So we write

      f = \sum_{g \in G} f(g) \cdot g and f' = \sum_{g \in G} f'(g) \cdot g

      noting both of these sums are finite, and define

      ff' = \left( \sum_{g \in G} f(g) \cdot g \right) \left( \sum_{g \in G} f'(g) \cdot g \right).

      We simply expand out the right-hand side and use the group law in G to “multiply” each pair of terms together.
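
      If it helps, here is a small Python sketch of this multiplication (my own illustration; I’m representing an element of F[G] as a dictionary sending each group element to its nonzero coefficient):

      def group_ring_multiply(f1, f2, op):
          """Multiply two elements of F[G], each given as a dict
          {group element: coefficient} (a finite formal sum);
          `op` is the group law of G."""
          product = {}
          for g, a in f1.items():
              for h, b in f2.items():
                  gh = op(g, h)  # multiply the formal terms: (a*g)(b*h) = (a*b)*(gh)
                  product[gh] = product.get(gh, 0) + a * b
          # discard zero coefficients to keep the formal sum finite and reduced
          return {g: c for g, c in product.items() if c != 0}

      # Example: G = Z/3Z written additively, F = Q.
      # (1*[0] + 2*[1]) * (3*[2]) = 3*[2] + 6*[0], since 1 + 2 = 0 mod 3.
      print(group_ring_multiply({0: 1, 1: 2}, {2: 3}, lambda g, h: (g + h) % 3))
      # {2: 3, 0: 6}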

    • mlbaker says:

      This may look like an awkward, roundabout construction, but the thing to take away from the group ring is what it does for us: F[G] is a ring such that there is a very natural one-to-one correspondence between the ring homomorphisms F[G] \to R and the group homomorphisms G \to U(R). Here, U(R) denotes the group of units of the ring R. Thus, noting that U(\mathrm{End}(V)) = \mathrm{GL}(V), a ring homomorphism F[G] \to \mathrm{End}(V) is “the same thing” as a group homomorphism G \to \mathrm{GL}(V). Reformulating this again, a ring action of F[G] on V is “the same thing” as a representation of G on V.

    • mlbaker says:

      Oops, I just read my post again and realized that what I said about the correspondence seems incorrect. I will look at this again tomorrow. The property satisfied by F[G] is something to that effect but it needs to be refined a little; I think R needs to be replaced by an F-algebra and the ring homomorphisms F[G] \to R need to be replaced by F-algebra homomorphisms. But since I don’t really want to get into all that, I’ll try to see if I can find a nicer way to formulate what the group ring is doing for us…

    • mlbaker says:

      Okay, my intuition from last night was correct. Here is why it makes perfect sense to consider F-algebra homomorphisms in this setting: if we’re given a vector space V, then a representation of G on V is not just “any” F[G]-module structure on V. Indeed, V is already an F-vector space, so representations of G on V are the same as those F[G]-module structures on V which agree with the existing F-module structure on V. Now, recall that an F[G]-module structure on V is a ring homomorphism \rho : F[G] \to \mathrm{End}(V). By the discussion above, though, we want \rho(\lambda \cdot e)v = \lambda v for all v \in V. Here e \in F[G] is the identity element. But this is exactly the statement that \rho be an F-algebra homomorphism (i.e. a ring homomorphism that “fixes F“), since \rho(e) = \mathrm{id} \in \mathrm{End}(V).

      To be more elegant, we can simply say that F-representations of G are the same as F[G]-modules, since such a module encodes both an F-vector space as well as an action \rho : G \to \mathrm{GL}(V) of G on that vector space. You’ll probably have more luck proving the equivalence of these two concepts now that these subtleties have been ironed out.
