Theoretical Background of Paragogy

Corneli, Joseph A. 2014. Peer Produced Peer Learning: A Mathematics Case Study. PhD thesis. Milton Keynes, The Open University.

The term paragogy (literally, "para-" alongside, "-gogy" leading) is used here to characterize the critical study and practice of peer produced peer learning, adapting the classical concept of pedagogy and the relatively recent notion of andragogy (Knowles, 1968) to a peer learning context. The need for a theory of this nature was articulated in dialog and collaborative research with Charles Jeffrey Danoff (Corneli & Danoff, 2011a, 2011b). (Corneli 2014: 33)

What a neat term! I can only quibble with the minute semantics of the word, as the "peda-" in pedagogy and "andra-" in andragogy designate the subject of leading. In effect, paragogy would signify something like "distinct from leading". I imagine my own way of learning being closer to this interpretation: more often than not I learn not from university courses and lecturers but despite them; that is, on a path distinct from where we are supposed to be led. But the meaning given to it here works well, too, I guess (I don't yet know what peer produced peer learning is).

What if every participant specified their level of commitment in advance? Building a "contract" with the facilitator - and with the community - would help ensure that people would make appropriate commitments (and keep them). (Corneli 2014: 34)

Is it even possible to specify the level of commitment in advance? My university lets students deregister from a course within the first two weeks. These first few weeks are viewed as a testing period for everyone to decide whether they want to commit to the course or not. I imagine this is common practice. Our unofficial seminar on autocommunication also disintegrated after two weeks, but that may have been due to poor timing (at the end of the semester).

At the same time, from a paragogical perspective, learning is not an automatic default; it is a specific sort of highly conditioned side-effect. One may be stuck in a rut. A peer should not be conceptualized as someone who is stuck in the same rut. Rather, people who learn together or from one another, who help one another get a better grasp of the world, are understood to be "peers." (Corneli 2014: 36)

"[...] each individual may hold a small piece of information which is entirely useless to him alone; but when he is in contact with a group [...] the isolated pieces of information fall into place. [...] The codification of information is divided among many people, and only when the mosaic is put together does it become significant." (Ruesch 1972[1953]: 74) And: "[...] the gradual interchange of information and successive correction leads to the establishment of correspondence of information between A and B, which state might be called "understanding"." (ibid, 86)

In a more detailed interpretation, "Context" could be decomposed into nested, overlapping, or adjacent contexts. (Corneli 2014: 37)

This is almost what I'd like to do with Jakobson's context component when I get to it. I'd like to resuscitate Albert Gehring's contextual function (in The Basis of Musical Pleasure, 1910), although that would take a lot of effort (mostly because Gehring himself does not use the term, but that is how S. K. Langer characterizes his efforts). A more accommodating parallel is within the Marty/Tynyanov syn- and autofunction. In short, the synfunction concerns "adjacent contexts" (from "syn-" meaning place together) and the autofunction concerns, well, not exactly nested or overlapping contexts but other similar contexts that are not necessarily adjacent. I now realize that this is almost like the distinction between enigmatic and paradigmatic situations in that the autofunctional context has an already established set of rules to it, while the enigmatic situation is understood through the broader social context.

Within the field of education, Marlene Scardamalia's notion of "collective cognitive responsibility" (Scardamalia, 2002) is sympathetic, although it remains basically provisionist. Her claim is that in the standard classroom model, "all the higher-level control of the discourse is exercised by the teacher." Students are "reactive" and "receptive," their work focused on "tasks and activities." Small groups and decentralized, constructive, computer mediated communication are seen as two possible alternatives that "turn more responsibility over to the students." (Corneli 2014: 38)

This chapter is full of notions that sound so awesome and yet feel distinctly out of my grasp. I've made the same distinction between the reactive style of American students as opposed to the receptive style of Estonian students before, although in different terms. Actually, I got into Jakobson thanks to a course dedicated to him in which the lecturer acted very little; the students made almost all the presentations and there was a palpable "collective cognitive responsibility" to make sense of Jakobson's work. It almost felt like the kružoks Jakobson himself created in his lifetime (e.g. Moscow Linguistic Circle and Prague Linguistic Circle).

Our best experiences as course organizers happened when we were committed to working through the material ourselves. (Corneli 2014: 39)

This sounds so true to life. I like lecturers who at least seem terribly interested in what they are teaching. In the end it rubs off. Among the local lecturers there are several iterations of the idea that you don't really teach the students the subject. You teach them enthusiasm towards the subject and they will proceed to study it themselves.

The P2PU governance model was said to be based on "rough consensus," after MIT professor David Clark's description of the Internet Engineering Task Force at the July 1992 IETF conference: "We reject: kings, presidents and voting. We believe in: rough consensus and running code." (Corneli 2014: 39)

This sounds a lot like the character of Jakobson's circles. The Prague Linguistic Circle reportedly even had the attitude of "If you can't stand the heat, get out of the kitchen". I imagine "running code" must have been quite important, literally, as they all spoke several languages and had to switch between them constantly.

Bateson's hierarchy Learning I, Learning II, Learning III, etc., begins with Zero Learning, which denotes no change in the subject, only a stimulus and a reaction that is already "soldered in" (Bateson, 1972, p. 288). As with paragogy, Bateson understands learning to be fundamentally related to change, but in his model, he imposes the requirement that the environment should not change while learning is happening. (Corneli 2014: 43)

As I vaguely remember, Bateson's theory of learning was heavily embedded in behaviorism. Let's see the relevant passage:

Note that in all cases of Learning I, there is in our description an assumption about the "context." This assumption must be made explicit. The definition of Learning I assumes that the buzzer (the stimulus) is somehow the "same" at Time 1 and at Time 2. And this assumption of "sameness" must also delimit the "context," which must (theoretically) be the same at both times. It follows that the events which occurred at Time 1 are not, in our description, included in our definition of the context at Time 2, because to include them would at once create a gross difference between "context at Time 1" and "context at Time 2." (To paraphrase Heraclitus: "No man can go to bed with the same girl for the first time twice.")
The conventional assumption that context can be repeated, at least in some cases, is one which the writer adopts in this essay as a cornerstone of the thesis that the study of behavior must be ordered according to the Theory of Logical Types. Without the assumption of repeatable context (and the hypothesis that for the organisms which we study the sequence of experience is really somehow punctuated in this manner), it would follow that all "learning" would be of one type: namely, all would be zero learning. Of the Pavlovian experiment, we would simply say that the dog's neural circuits contain "soldered in" from the beginning such characteristics that in Context A at Time 1 he will not salivate, and that in the totally different Context B at Time 2 he will salivate. What previously we called "learning" we would now describe as "discrimination" between the events of Time 1 and the events of Time 1 plus Time 2. It would then follow logically that all questions of the type, "Is this behavior 'learned' or 'innate'?" should be answered in favor of genetics. (Bateson 1972: 288)
[Bateson, Gregory 1972[1964/1968]. The Logical Categories of Learning and Communication. In: Steps to an Ecology of Mind. Chandler Publishing Company, 279-308.]

As I had guessed, this is indeed rooted in the behaviorist debates of the day. I'm not sure how well the Pavlovian conditioning stuff translates into modern learning theories.

His assertion is that "[t]he notion of repeatable context is a necessary premise for any theory which defines 'learning' as change" (Bateson, 1972, p. 296). Bateson does allow a hedge, which is that the context is only "somehow" or "theoretically" the same. The problem is that, here, Bateson is understanding context as "a metamessage which classifies the elementary signal." Rather than talking about a repeatable context, it would be clearer to simply say that the stable classification of signals is what is important. (Corneli 2014: 43)

Is it a good idea to define learning as change? In behaviouristic theories this change was probably understood as a change in the observable behaviour. I find it difficult to imagine learning math as a change in behaviour. "Metamessage" sparked my interest, so I'll turn to Bateson again:

[Continues from the previous passage, on the next page.] We would argue that without the assumption of repeatable context, our thesis falls to the ground, together with the whole general concept of "learning." If, on the other hand, the assumption of repeatable context is accepted as somehow true of the organisms which we study, then the case for logical typing of the phenomena of learning necessarily stands, because the notion "context" is itself subject to logical typing.
Either we must discard the notion of "context," or we retain this notion and, with it, accept the hierarchical series - stimulus, context of stimulus, context of context of stimulus, etc. This series can be spelled out in the form of a hierarchy of logical types as follows:
Stimulus is an elementary signal, internal or external.
Context of stimulus is a metamessage which classifies the elementary signal.
Context of context of stimulus is a meta-metamessage which classifies the metamessage.
And so on.
The same hierarchy could have been built up from the notion of "response" or the notion of "reinforcement."
Alternatively, following up the hierarchic classification of errors to be corrected by stochastic process or "trial and error," we may regard "context" as a collective term for all those events which tell the organism among what set of alternatives he must make his next choice.
At this point it is convenient to introduce the term "context marker." An organism responds to the "same" stimulus differently in differing context, and we must therefore ask about the source of the organism's information. From what percept does he know that Context A is different from Context B? (Bateson 1972: 289)

I would prefer the latter alternative because the notion of "context marker" sounds much more useful than "context of stimulus [a]s a metamessage". These two may in fact be related, because a metamessage is, by definition, a message about a message - which could very well be a "context marker" instead of the context of stimulus as such. Still, all of this is beyond my understanding, as the Theory of Logical Types (and most everything else related to Russell) goes over my head.

As an important class of examples, day-to-day communication works largely due to the fact that "the stream of events is commonly punctuated into contexts of learning by a tacit agreement between the persons regarding the nature of their relationship" (Bateson, 1972, p. 304). (Corneli 2014: 44)

This sounds an awful lot like communication about relationship (Bateson's μ-function). Let's see:

Of the multitudinous ways in which Learning II emerges in human affairs, only three will be discussed in this essay:
(a) In describing individual human beings, both the scientist and the layman commonly resort to adjectives descriptive of "character." It is said that Mr. Jones is dependent, hostile, fey, finicky, anxious, exhibitionistic, narcissistic, passive, competitive, energetic, bold, cowardly, fatalistic, humorous, playful, canny, optimistic, perfectionist, careless, careful, casual, etc. In the light of what has already been said, the reader will be able to assign all these adjectives to their appropriate logical type. All are descriptive of (possible) results of Learning II, and if we would define these words more carefully, our definition will consist in laying down the contingency pattern of that context of learning I which would expectably bring about that Learning II which would make the adjective applicable.
We might say of the "fatalistic" man that the pattern of his transactions with the environment is such as he might have acquired by prolonged or repeated experience as subject of Pavlovian experiment; and note that this definition of "fatalism" is specific and precise. There are many other forms of "fatalism" besides that which is defined in terms of this particular context of learning. There is, for example, the more complex type characteristic of classical Greek tragedy where a man's own action is felt to aid the inevitable working of fate.
(b) In the punctuation of human interaction. The critical reader will have observed that the adjectives above which purport to describe individual character are really not strictly applicable to the individual but rather describe transactions between the individual and his material and human environment. No man is "resourceful" or "dependent" or "fatalistic" in a vacuum. His characteristic, whatever it be, is not his but rather a characteristic of what goes on between him and something (or somebody) else.
This being so, it is natural to look into what goes on between people, there to find contexts of Learning I which are likely to lend their shape to process of Learning II. In such systems, involving two or more persons, where most of the important events are postures, actions, or utterances of the living creatures, we note immediately that the stream of events is commonly punctuated into contexts of learning by a tacit agreement between the persons regarding the nature of their relationship - or by context markers and tacit agreement that these context markers shall "mean" the same for both parties. (Bateson 1972: 297-298)

I guess Bateson didn't shake off his influences from Palo Alto in the sixties. The bit about character being a social phenomenon is very much the same stuff Ruesch discusses in relation to psychosomatic illnesses. Simply put, sometimes what we call "mental disease" is not an outcome of a pathological mind but a result of communication disturbances within the group (family, friendship circle, working environment, etc.). I notice that he drew on Dewey and Bentley's Knowing and the Known in this regard (e.g. the talk of "transactions"). The punctuation of interaction was one of the discoveries of Ray Birdwhistell, another member of the Palo Alto group (notice the distinction between postures and actions). And when he says "such systems, involving two or more persons" he is essentially discussing what he and Ruesch earlier termed a communication system. In this sense "context markers" are not much different from what Ruesch calls "metacommunicative instructions".

The pattern by which we should define Learning III and so forth is now relatively clear. Since "what is learned in Learning II is a way of punctuating events" (Bateson, 1972, p. 305) then what is learned in Learning III is a way of punctuating the punctuation of events - and so on. In other words, Learning II is about creating the boundary conditions in which a given pattern of control will be exercised, subjected to the standard caveat that this control is partial. In a paragogical view, punctuation of this sort could also be called context creation - although often more is involved in context creation than just bracketing. (Corneli 2014: 44)

I don't really see the use of the hierarchical approach (stimulus, context of stimulus, context of context of stimulus). It sounds like an overly complex way to go about interaction management. Even Erving Goffman's discussion of the involvement idiom would be more beneficial, I think. But all in all it sounds exactly like the kind of stuff that I would subsume under the notion of channel regulation (a better term doesn't come to mind at the moment).

Here, relationship especially means a context for communication, which as we've seen, means a context for learning, and, more broadly, a context in which cybernetic control is exercised. Bateson shows through many examples that humans and other mammals typically communicate about their most meaningful relationships with one another in nonverbal - i.e., analog - ways. He introduces the term μ functions to describe communication acts, whether digital or analog, spoken or unspoken, which are ways of "voicing" relationship. (Corneli 2014: 45)

As much as I would like to protest that the context has very little to do with it, this is essentially correct. Back in April when I wrote a messy essay on Bateson's μ-function I concluded that he deduced this from his and Ruesch's earlier metacommunication, specifying it as metacommunicative instructions about contact (whether messages pass through, whether they are understood and whether to terminate the contact). Just like with Jakobson's phatic function, one steps on thin ice when trying to make sense of the exact nature of the relationship (in Jakobson, the character of contact). From what I found in his collaboration with Ruesch I'd say that he means "relationship" in the very transient sense of "communicative relation", i.e. the fact of communicating at the moment, rather than what we understand as "human relationships", "social relations" or, here, "meaningful relationships". In short, the relationship I think he has in mind for the μ-function is about mutual awareness and influence rather than, say, friendship, intimacy, etc. I may, of course, be mistaken. In any case it would be interesting to consider both options: even if he does mean it in a transient sense, the notion can be expanded to cover more lasting relationships.

Paragogy not only "eschews pipeline models of transmitting knowledge" (Papert & Harel, 1991) but treats communication as an assemblage with emergent properties, constituted by its participants, and subject to the vagaries of existence in the real world. In this respect, we find paragogy at once in both its (em)phatic and productive senses. (Corneli 2014: 48)

I'd suggest the communication system approach of Ruesch and Bateson. Not only because it accounts for these aspects but because it proposes a network (or rather several networks) instead of the pipeline.

In short: paragogy is concerned with generating and voicing the μ functions that sustain relationships in which learning can take place, and through which meaning can be made. (Corneli 2014: 48)

I see the connection between paragogy and the μ-function (or phatic communion, communization, etc.) but I have as yet very little idea how the μ-function could be elaborated into something like a μ-algorithm.
