The Minciu Sodas laboratory.  Materials for the draft of an import/export standard for aggregates of notes.

Issues

The materials below are in the public domain, mostly taken from the archives of our working group ourownthoughts@egroups.com which all are welcome to view.  I am organizing them by issue, along with proposed answers and open questions.  I invite you to send me your comments at ms@ms.lt.  Andrius Kulikauskas


Why record and organize thoughts and their relationships?
How does our thinking affect us?
What do we need to reexperience our thinking?
How does our experience of thinking affect us?
What is the benefit of our standard?
What is the nature of our standard?
What are we modeling?
What are some basic aspects of thinking that we should model?
What do we variously need to reexperience our thinking?
What are the global constraints that we typically apply as we structure our thoughts?
What formats should we use?
How should relationships (links) be understood?
Where are they stored?
What is their structure?
How do we deal with additional features?
What will we name our standard?


What are the global constraints that we typically apply as we structure our thoughts?

This question is the basis for our investigation "Linking Locally is Thinking Globally", or in the words of TheBrain website, "thinking by linking".  The idea is that by choosing to organize my thoughts within a given structure, such as a tree, I am shaping my thinking.

An example where I myself have done this is the development of the Thoughtful Wishing matrix.  I looked through all of the examples we've collected as to what kinds of tools for thinking we would like, and on a blank sheet of paper jotted down the different kinds of wishes that I found, so that each wish was a handful of words.  Then I used Microsoft Paint to write out the wishes on my computer screen, and placed related wishes next to each other.  The two-dimensional screen seemed adequate for this, but any network of bi-directional links would have suited my purposes, so long as I could later visualize it as a map.  (It would have been nice to switch back and forth between TheBrain and a map-type visualization.)  I was then able to observe within the map certain kinds of regularities.  By organizing my thoughts in this bi-directional network I was able to look for and notice certain kinds of patterns that I would not have noticed had I organized them in other ways or not at all.

The format that I envisage has us (or our converter) identify the structural type of our link, whether it is a non-directed network (NN), or directed network (DN), or unordered hierarchy (UH), or ordered hierarchy (OH), and so on.  We really need some kind of empirical basis to be able to say what kind of structural constraints occur.
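As a minimal sketch of how a converter might record these codes: the two-letter codes are the ones given above, while the class name `StructuralType` and the helper `parse_structural_type` are illustrative assumptions, not part of any agreed format.

```python
from enum import Enum


class StructuralType(Enum):
    """Structural link types named above, keyed by their two-letter codes."""
    NONDIRECTED_NETWORK = "NN"
    DIRECTED_NETWORK = "DN"
    UNORDERED_HIERARCHY = "UH"
    ORDERED_HIERARCHY = "OH"


def parse_structural_type(code: str) -> StructuralType:
    """Map a code such as 'dn' to its structural type; raises ValueError if unknown."""
    return StructuralType(code.strip().upper())


print(parse_structural_type("dn").name)
```

The list is deliberately open: further types found in practice would simply become further members.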

In the TopicMaps movement there are collections of semantic link types, often thousands of types, such as X "IS A PART OF" Y, or X "IS IN" Y.  I will want to review one or more collections to figure out what kinds of structural constraints can be inferred, which I imagine are far fewer in number.

I also want to compare this with the kinds of thought organizing tools that we find to exist, the kinds of structurings that they offer. I will invite tool makers to become members, and instead of the $300 membership fee, allow them instead to provide us with $900 of their software.  We will distribute this software to members of our working group as you help us with our work.  In this way, our laboratory can grow and strengthen our efforts.

So, to the kinds of structural types we find above, we will add any others that you use, both as tool users and as toolmakers.  I will then claim that these are the kinds of structural constraints that we find in practice. [Andrius Kulikauskas, 7/00]

How do we decide which proposal is best?

We need to know what we are trying to express, and consider how well various expressions serve us. In March, Ben Darnell wrote an email proposing a way of expressing a thought and its relationships, based on what he uses for Thoughtstream.  His proposal was very interesting.  It got me thinking, how do we decide which proposal is best?  I think this is the kind of question that algebraic semiotics considers.  We need to think about what we are trying to express, and then we can consider how well various expressions serve us. [Andrius Kulikauskas, 5/00]

What formats should we use?

Answers: Excel, .CSV, XML, TopicMaps

There need to be several formats.  Otherwise we and others will confuse our "modeling language" with the formats that we use to implement it.  I am pursuing the Excel format as a pragmatic solution for my own personal needs.  I welcome and encourage alternative formats, such as the XML format that Ben Darnell (Thoughtstream, http://thoughtstream.org) and Stephen Danic (Lucid Fried Eggs) have used for conversion.  We can learn from competing formats what each is best for, and we can develop conversion between them.  Where possible, we need to identify our own needs for conversion, and address those needs now. [Andrius Kulikauskas, 7/00]

We are our own best customers.  As a user, I want my own needs to be met.  I am currently using Netscape Composer, but would like to be able to use TheBrain, MindManager and other tools for working with my thoughts.  I want to find a practical solution that I myself am happy with, and then it will be possible to share this solution amongst ourselves and with others, and see who likes it, and who does not. With this outlook, I have asked our investigator Raimundas Vaitkevicius to "make this so".  He is free during the summer, is interested in working with Java, Small Talk, Basic, and other languages, and is interested in organizing programmers.  So I will devote $500 a month so that he can accomplish this.  I have in mind a simple six-column flat-table format that would meet my needs.  We are both excited to realize that Excel is our vehicle of choice for this format.
Excel spreadsheets are perhaps the most basic way that individuals and even IT departments exchange tabular information.  At my old job we used to receive rather large tables as Excel spreadsheets, and then within Excel we followed a semi-automated protocol we had devised to clean up the data: check for consistency, root out duplicate records, change field names, parse names and addresses, etc.  Excel is designed for this kind of "hands-on" examination and reformatting of the data.
Furthermore, Excel is for our purposes conceptually unintrusive, emphasizes the tabular form of the data, and does not have any visible format, so the "modeling language" aspect of our work can stand out. Excel encourages a wide range of sophistication of user ability, from manipulating rows and columns, to running macros, to programming in Basic, to integrating with other applications.  I know that this will meet my needs and encourage me to experiment with the many tools for thinking.  I think by satisfying our own needs with this format, and having it available, we will have something to distribute that will allow us to see who needs our solutions.  I think that this format can be contagious and allow us to organize an open source movement that would develop an Excel-based interactive multiconverter with an ever growing set of macros, instructions, and guidelines. Inspiration for the use of Excel comes from our member Caspar van Beek, who became a member by sharing in the public domain a converter that he designed for exporting information from MindManager to Excel. Also, several respondents to Raimundas' survey "Do you organize your thoughts?" indicated that they use Excel as a tool in its own right for organizing their thoughts.  [Andrius Kulikauskas, 7/00]

TopicMaps.Org  I have spoken with Steve Pepper of the TopicMaps movement.  He spoke of a need for TopicMaps Templates, specializations of the TopicMaps that would make their use practical in different areas of life.  In general, most of what we want to do can be expressed with TopicMaps (with perhaps the exception that an association is not a topic, at least not as the standard is presently defined, although it does seem to be evolving).  So we can try to use the TopicMaps formalism to express our modeling language as a TopicMaps Template, where we introduce a topic type Thought, and association types DirectedNetwork, NondirectedNetwork, OrderedHierarchy, etc. given by the structural types that we find.  As we develop our own working format, construct our own modeling language, and try to describe both one and the other with TopicMaps, then we will have a creative tension that will let us learn from the TopicMaps experience, but also diverge from it, if needed.

Excel  You've probably noticed I've gotten excited about the use of Excel spreadsheets as a flat-table format for our standard.  That's because it's a solution that appeals to what I personally want, some kind of straightforward and personal control over my data.

Of course, we can phrase our standard as a Topic Maps Template, etc., etc. But as far as actually having to deal with import/export in real life, an Excel spreadsheet is very down-to-earth, and I think that among actual users, a large portion of them will see it this way, too. Of course, we can take a poll.

I want to credit Caspar van Beek for bringing our attention to the use of Excel.  Originally, I thought this was quite odd, but now I think it's genius.  [Andrius Kulikauskas, 7/00]

I propose that we find users of TheBrain who have or would like to have such work patterns, and develop converters via an Excel format to facilitate these patterns.  [Andrius Kulikauskas, 7/00]

This might be nitpicking but I have to say it.  Don't use Excel as your default data format or your intermediary conduit.  Stay well away from any proprietary data format.

The whole point of a standard is to make it universally accessible. Excel doesn't fit this criterion.  [Steve Danic, 7/00]

Would that be true if one only used the 'grid' function of Excel?  So no Excel functions, just data in cells A1..ZZ2000.  [Caspar van Beek, 7/00]

It undermines the efforts and the culture of free software programmers everywhere. It supports a software company which practices illegally and harms smaller organizations. The more data that's in excel, the more small software authors have to worry about that as an import format.

I'm not sure what you are trying to achieve by putting information into spreadsheets, since spreadsheets are primarily effective as calculation tools.

If you feel you must use a spreadsheet format, use comma delimited text (.CSV) instead. It's the de-facto import standard for all spreadsheet programs. It's supported in far more applications than Excel is.

You'll notice that I have a .csv export on memes.net. It's every bit as good as having an excel export, but it doesn't force other software authors to write excel import filters.

Of course, I don't have any import filters, so it's tough to get modified data back into Lucid.
[Steve Danic, 7/00]

In my opinion, if the expressive power of the .CSV format is sufficient for a [prototype] standard exchange format, let it be the basis for converters.  Using .CSV files with Excel is as simple as opening .TXT files with MSWord, and the restriction of having to save/import each sheet (prepared with Excel, etc.) separately is acceptable. [Saulius Maskeliunas, 7/00]
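To make the .CSV suggestion concrete, here is a small sketch of writing a few thoughts as comma-delimited text and reading them back, using Python's standard csv module.  The four columns and the sample rows are invented for illustration; they are not the six-column format under discussion.

```python
import csv
import io

# Hypothetical rows: (thought name, content, related thought, link type).
rows = [
    ("Wishes", "Kinds of tools we would like", "Thoughtful Wishing", "NN"),
    ("Thoughtful Wishing", 'A matrix, drawn "by hand"', "", ""),
]


def to_csv(rows):
    """Serialize rows as comma-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Read the same text back into a list of tuples."""
    return [tuple(record) for record in csv.reader(io.StringIO(text))]


text = to_csv(rows)
assert from_csv(text) == rows  # commas and quotes inside a cell survive the round trip
```

Only the grid of cells is carried; formulas, fonts and multiple sheets are not, which is the restriction accepted above.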
 

How should relationships (links) be understood?

Where are they stored?

What is their structure?

How do we deal with additional features?

Andrius Kulikauskas, 8/22/00:         I think we should allow for optional additional columns, I think this is very important.  In general, I think we should encourage toolmakers to design their export functions so that they include all of the aspects of thought as additional fields.  For example, TheBrain can export "datecreated", or MindManager can export the X and Y coordinates.

Ben Darnell, 8/22/00: This way madness lies.  Suppose TheBrain and MindManager each define an extension column, which would be the column immediately after the standard columns.  If one tried to read a file produced by the other, it would get confused because it would see the (x,y) coordinates where it expected a date, or vice versa.  You could define a means for identifying these extensions, but then you're moving towards reinventing XML.  It would certainly be easier to write a parser for this language than for XML, but the key fact about XML is that you don't have to write a parser - you just use an off-the-shelf parser with an interface for your language-of-choice.

I am focusing more on "manual modeling" rather than on "automated transfer".  So my feeling is that, for example, exporting from TheBrain we would like to at least make available all of the information.  It is easy to throw extra columns away, at least from a spreadsheet, where you just delete the offending columns. (I imagine with XML this is more difficult, especially if you don't have an editor).

So exporting, I don't see a problem of having additional columns appear, making explicit all sorts of information that a thought or relationship may have, such as: date created or edited, position, font, color, file, synchronization, etc..  Also, I think we would want to encourage users to create their own extra columns to experiment with and manipulate their thoughts in creative ways.  For example, these extra columns may be extremely helpful in breaking up one large brain to create two smaller ones.

However, on importing, what we would strive for is that the first six columns (the required columns) would be imported, that is: the name and content of the thought, the relationships between the thoughts, and that this would be done respecting and preserving the structural link type (Closed Sequence, Open Sequence, Unordered Hierarchy, Acyclic Network, Directed Network, Nondirected Network would be the reserved structural link types).
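A rough sketch of such an import, under stated assumptions: the text fixes the number of required columns (six) and the reserved link types, but not the column order, so `link_type_index` below is a hypothetical position, and the function name `import_required` is likewise invented.

```python
import csv
import io

REQUIRED_COLUMNS = 6  # the first six columns are required; their order is not settled here

RESERVED_LINK_TYPES = {
    "Closed Sequence", "Open Sequence", "Unordered Hierarchy",
    "Acyclic Network", "Directed Network", "Nondirected Network",
}


def import_required(text, link_type_index=3):
    """Keep only the required columns of each record and check its link type.

    link_type_index is a hypothetical position for the structural link type.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        required = row[:REQUIRED_COLUMNS]
        if len(required) < REQUIRED_COLUMNS:
            raise ValueError(f"record has fewer than {REQUIRED_COLUMNS} columns: {row}")
        if required[link_type_index] not in RESERVED_LINK_TYPES:
            raise ValueError(f"unknown structural link type: {required[link_type_index]!r}")
        records.append(required)
    return records


# Three extra columns (a date and two coordinates) follow the six required ones.
sample = "A,text of A,B,Directed Network,,,2000-08-22,120,45\n"
print(import_required(sample))
```

Extra columns are simply dropped here; a fuller importer would also tell the user which ones it dropped.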

Also, I would expect of a good pair of converters that whatever can be exported can also be imported.  So, for example, if we can export from MindManager fifteen columns worth of features (six required columns and nine additional), then we should be able to import the same fifteen columns.  The user should be able to know that if they prepare the information correctly for these fifteen columns, then they will be able to do the import.  A good import converter should make this information available.  Also, I think the burden should be on the import converter to let the user know if some information will not go through, will get lost.
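One way an import converter could carry that burden is to compare the columns in the file with the columns it accepts, and report the difference before importing.  The sketch below assumes columns are identified by name; the function name `columns_lost` and the column lists are illustrative only.

```python
def columns_lost(file_columns, importer_columns):
    """Return the columns present in the file that the importer cannot accept, in file order."""
    supported = set(importer_columns)
    return [name for name in file_columns if name not in supported]


# Hypothetical example: a file carrying a creation date is read by a tool that has no such field.
exported = ["name", "content", "datecreated", "x", "y"]
accepted = ["name", "content", "x", "y"]
print(columns_lost(exported, accepted))  # ['datecreated']
```

An empty result would tell the user that everything in the file will go through.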

The burden on the user is, especially, to understand and make sure that the information they want to import has the proper structural link types.  This is because I can then use tools creatively.  For example, I can map the Unordered Hierarchy within MindManager to either the Directed Network (child-parent) of TheBrain, or the NonDirected Network (jump) of TheBrain.  So the burden is on the user to make such a change.  This is not difficult with a spreadsheet, but more complicated operations, like turning a network into a tree, will need the help of an interactive multiconverter.
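The simple case, relabeling one structural link type as another, might look like the sketch below.  The row layout, the position of the link type column and the function name `remap_link_type` are assumptions made for the example.

```python
def remap_link_type(rows, source_type, target_type, type_index):
    """Return copies of rows with source_type replaced by target_type in one column."""
    remapped = []
    for row in rows:
        row = list(row)  # copy, so the caller's rows are left untouched
        if row[type_index] == source_type:
            row[type_index] = target_type
        remapped.append(row)
    return remapped


# Hypothetical rows: (thought, related thought, structural link type).
rows = [
    ["Budget", "Project", "Unordered Hierarchy"],
    ["Budget", "Salaries", "Nondirected Network"],
]
for_the_brain = remap_link_type(rows, "Unordered Hierarchy", "Directed Network", type_index=2)
print(for_the_brain[0][2])  # Directed Network
```

This is the kind of change a user can also make by hand in a spreadsheet; turning a network into a tree changes which links exist, not just their labels, and is not covered by this sketch.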

The above is what seems to me very practical and needed.  Or is it?  It doesn't rule out a parallel XML version.  Also, it doesn't rule out building further agreement on the added features, like Color, that might be harmonized.  Although then, as you indicate, it would be better to rely on column name instead of position.

I admit that there's a tough question of how do we formulate the required columns, so that we don't run into trouble later.  Do we just, for example, reserve the first six, or do we have the first record provide column names, and the second record provide, for example, version information?  Or do we do both?  What names do we choose for the columns, and what names or codes do we choose for the reserved link types?
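To see what one of these options would involve, here is a sketch in which the first record names the columns and the second carries version information, so that additional columns are found by name rather than by position.  The layout, the version string "0.1" and the column names are all hypothetical; nothing here has been agreed.

```python
import csv
import io


def read_named(text):
    """Read a table whose first record names the columns and whose second gives a version.

    This layout is one of the options raised above, not a settled rule.
    """
    reader = csv.reader(io.StringIO(text))
    names = next(reader)
    version = next(reader)[0]
    records = [dict(zip(names, row)) for row in reader]
    return version, records


from_brain = "name,content,datecreated\n0.1,,\nA,text of A,2000-08-22\n"
from_mindmanager = "name,content,x,y\n0.1,,,\nA,text of A,120,45\n"

for text in (from_brain, from_mindmanager):
    version, records = read_named(text)
    # An extension is looked up by name, so a date is never mistaken for a coordinate.
    print(version, records[0].get("datecreated"), records[0].get("x"))
```

A reader that does not know a column name can skip it, which addresses the confusion between a date and a pair of coordinates, at the price of one more convention to agree on.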

I don't deny that these thoughts can ultimately lead to "reinventing" XML.  If they do so, then they help explain why XML is the way it is, and let us do something simpler, when we don't need something that complicated.  I showed Steven Newcomb how our spreadsheet format helps explain to people how you end up using TopicMaps, that is, how they "unfold".  He's very supportive of the value of having a conceptual continuum between such a "simplest" conceptual framework and something as sophisticated as TopicMaps.  The direction that I want to offer to sponsors is that, with their sponsorship, we can collect use cases that illustrate this continuum, and thereby show the value of solutions they can offer up and down the continuum.

In particular, I think it's wonderful for TheBrain, and likewise Thoughtstream, if the technology is applicable (scalable in terms of sophistication) to large portions of the continuum, rather than nail it down to a particular spot, so that TheBrain is just a vehicle for TopicMaps.  The standard that we are developing is so simple and elegant that, not only does it allow for the greatest freedom, but it makes clear to the user how sophistication can get introduced step by step.  At least that's my goal.  What is, or would be, your goal?
[Andrius Kulikauskas, 8/00]

What will we name our standard?

Ideas include:
ThreeBook, BrainRack, MindSet, IrDAKiss  [Andrius Kulikauskas, 8/00]
Roy Roebuck calls his metaschema tree-star-flow.  [Andrius Kulikauskas, 8/00]



UNDER CONSTRUCTION


    Joseph Goguen's work is very relevant in terms of thinking how sign
systems, user interfaces, and other entireties can variously represent
an underlying conceptual model.  Joseph, I suppose your work doesn't
presume an underlying conceptual model, but doesn't negate it, either.
My experience is that there exist conceptual models that we might live
directly but can reflect on only through particular representations.
     Abstract algebra, however, may serve as an example.  A group is a
mathematical system with the basic requirements to support addition and
subtraction.  Examples are the integers, but also the hours on a clock,
or the ways of flipping and flopping a rectangular piece of paper.
Typically the members of a group are considered actions, like moving the
hour hand so many hours ahead.  In order to talk about the group
concretely, write down expressions and use an addition table, we end up
using a system of symbols, and talking about the "group action" on the
set of symbols that we are able to write down.  My understanding is that
such a system of symbols can always be identified with a representation
of the group in terms of matrices, that is, the actions are identified
with matrices, the addition of actions is matrix multiplication, and the
rows and columns of the matrix can be identified with symbols, so that
the matrix multiplication yields equations amongst these symbols.
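As one concrete instance of this (it illustrates the clock example only, not the general claim), the hours on a clock can be written as permutation matrices, and adding hours then corresponds to multiplying matrices.  The function names are invented for the sketch.

```python
def clock_matrix(k, n=12):
    """Permutation matrix for 'move the hour hand k hours ahead' on an n-hour clock."""
    return [[1 if i == (j + k) % n else 0 for j in range(n)] for i in range(n)]


def multiply(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


# Adding the actions (7 hours, then 8 hours) corresponds to multiplying their matrices.
assert multiply(clock_matrix(7), clock_matrix(8)) == clock_matrix((7 + 8) % 12)
```

Row i and column j of each matrix can be read as the symbols "hour i" and "hour j", so the product states an equation amongst those symbols.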
      In this analogy, a group would be an underlying conceptual model,
but we never work with the group directly, at least not on paper.  On
paper we work with matrix representations, which are like our "user
interfaces" for the group.  Instead of "conversion" from one user
interface to another, I would like us to encourage the transfer back via
the conceptual model.  I wonder if this might relate with your very
intriguing concept of "hidden algebra"?
Also, it would be great to get more answers to what you wish for, what
you would like to do with your thoughts, and why think at all.  Thank
you to William Wagner!  I intend to glean some answers from our surveys
"Do You Organize Your Thoughts?"
 
 

Opinion: Synchronization
Ben Darnell,
I don't think syncing should go in the standard itself (at least not at
first).  ThoughtStream has the capability to perform a
bi-directional sync between any two of its data files.  Is there any
interest in a generalized sync protocol, or should syncing remain an
application-dependent task?

For example, suppose I want to keep the "Intellectual Property" node from
memes.net and its children (out to three generations) on my Palm (using
ThoughtStream).  With just the import/export standard, I could export a
subset of the Lucid data and import it into TS.  However, that data includes
some large, infrequently-modified documents, so I would rather not pull all
of it down the wire every time I sync.
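The "export a subset" step of this example can be sketched as a breadth-first walk that stops after a given number of generations.  The data structure (a dictionary from parent to children) and every node name except the root are assumptions; neither Lucid's nor ThoughtStream's actual storage is shown.

```python
from collections import deque


def descendants_within(children, root, generations):
    """Return root plus its descendants down to the given number of generations."""
    selected = [root]
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == generations:
            continue  # do not descend past the last generation wanted
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                selected.append(child)
                queue.append((child, depth + 1))
    return selected


# Hypothetical parent -> children links; only the root name comes from the text.
children = {
    "Intellectual Property": ["Copyright", "Patents"],
    "Copyright": ["Fair use"],
    "Fair use": ["Case notes"],
    "Case notes": ["Too deep"],
}
print(descendants_within(children, "Intellectual Property", 3))
```

This selects what to send but says nothing about what has changed since the last transfer, which is the part that syncing would add.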

Syncing is a hairy problem, and we certainly want to be able to exchange
data with apps that don't support it, so it shouldn't go into the core
standard.  However, it can be useful, especially when crossing platforms, so
it might be interesting to extend the ThoughtStream syncing process to
encompass other programs.  Alternately, perhaps there is a more general
solution, such as SyncML.
 
 

On Mon, Apr 03, 2000 at 11:48:15PM -0700, Andrius Kulikauskas wrote:
> - in doing so you do not create a separate table for relationships,
> which would elevate them to a class on par with thoughts, breaking your
> 0-1-infinity rule.

The 0-1-inf rule refers to numbers of equivalent objects, not different
classes of objects.  The decision to use a single table for everything was
not driven by theoretical concerns, but for ease of implementation
(especially on the Palm) [Ben Darnell]

However, there is one aspect of the ThoughtStream data structure that might
be breaking this rule.  Association roles are defined as "parent", "child",
or "peer".  I'm not entirely comfortable with specifying a particular set of
relationships, but I think it's OK in this case.  A link can be either
directed or undirected, and if directed it can point to either the head or
the tail.  Therefore, there are only three possible link types.  Have I
overlooked some more exotic link type which should be included here?  Should
the standard be relaxed to allow an arbitrary number of link types (OTOH, if
the standard does not specify values for the xlink:role attribute, is there
any reason to use both xlink:role and xlink:title?)?
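The three roles can be illustrated by asking how the same link looks from its other end.  The reading below, in which "parent" and "child" are two views of one directed link and "peer" is the undirected case, is an interpretation of the paragraph above rather than ThoughtStream's documented behaviour, and the names are invented.

```python
# Assumed reading: a directed link is "parent" from one end and "child" from the other.
INVERSE_ROLE = {"parent": "child", "child": "parent", "peer": "peer"}


def seen_from_other_end(role):
    """Role of the same link as seen from the thought at its other end."""
    return INVERSE_ROLE[role]


for role in ("parent", "child", "peer"):
    # Looking from the other end twice brings us back to the original role.
    assert seen_from_other_end(seen_from_other_end(role)) == role
```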


Notes

The notes below are from a rough draft for the directional proposal "Exploratory IrDAKiss" and they are making their way upward.

Context: Setting our civilization right-side-up
The desire to organize thoughts
The lack of tools
The reason for the lack
The Need for the Standard
Why we haven't had a standard
The Implications of the Standard
The nature of the standard
The way to develop the standard
The need for exploratory and ultimate standards
********************
Context:  The role of Infrared
What differentiates Infrared
The pressure on Infrared
The conceptual solution for Infrared - social act
The value of the conceptual standard
********************
Conceptual Standard
Import/Export requires constraints
Importance of Clarity for Users
Conceptual Pitfall: Semantics and Tagging
Conceptual Pitfall: Visualization
Constraint: punctuate thought
Constraint: prompt
Constraint: relationship
Constraint: structural type
No Constraint: can be invalid, inconsistent, self-contradictory
********************
Conceptual Definition
Definition of an aThought
Components
********************
Implementations of the Standard
Importance of Clarity, Transparency
Implementations in Transport Standards
Implementations of aThought
Implementations of aPack
Implementations in Software Tool Export
Issue: attributes

********************
Protocol
Straightforward Import/Export of an aPack
Integrated Import/Export of Object Store of aThoughts
Interpretation of other objects as aThoughts
Complicated negotiations


Context: Setting our civilization right-side-up

We are taking up a mandate to change the world, and we will.  If anyone comes to us and says, "Hey! I'm an independent thinker and I need this because of..." then they can be half-sane and I am willing to think things through from scratch to serve them.  We are going to make something so conceptually strong that we will always be able to go back and look at why it is the way it is.  But I'm not going to worry the least about anonymous users who are only viewers, never authors, of thoughts. Microsoft and thousands of other companies are serving them happily.  After we figure out how to serve the needs of authors of thoughts, then we can worry about the tactics of implementation that will encourage all to act as authors.

I don't care about what's "acceptable to the larger group of users".  I care about what's acceptable to me and the people that I am aware of. The larger group of users doesn't need us, not yet.

There is a small but wonderful minority of people who need us badly.  They are the authors of thoughts, the independent thinkers.  We want them to be happy, we want those who serve them to be successful, and we want everybody to see that they are independent thinkers, too, as much as they choose to be.

The latter case is where we are authoring for others, rather than for ourselves.  Our standard is for supporting people who want to, first of all, author for themselves, and only then for other people.
 
 

The desire to organize thoughts

Why organize thoughts?  Why use software - in fact, more generally - why use structures for organizing thoughts?  The answer is psychological: different kinds of writing promote different kinds of thinking.  Arranging ideas in a sequence helps us distinguish strong and weak ideas, in a tree - broad and narrow, and in a network - vague and clear.

Collect use cases  We want to pay the greatest attention to respecting these psychological effects, which are the reason why we use software for arranging and organizing our thoughts.  We need to collect use cases to understand what thinkers are doing.  We need to analyze - critically - what existing software tools and standards allow users to do, and more importantly, how thinkers might put them to creative and unexpected use.

The lack of tools

The reason for the lack

The Need for the Standard

Problems with existing standards

Why we haven't had a standard

The Implications of the Standard

encourage knowledge workers to use software tools to accumulate, arrange and reflect on their ideas/notes, knowing that converters based on our standard will keep notes from getting trapped in products.

allow knowledge workers to transfer their notes back and forth between software tools to explore and enjoy the benefits of various tools and their combinations.

The nature of the standard

The way to develop the standard

The need for exploratory and ultimate standards

I would like us to make as rapid and pragmatic progress as
possible to develop an import/export standard for sequences,
hierarchies and networks of notes.  I propose we look at how
existing software products and standards use these structures,
and then draw up a simple experimental format that can handle
transfers between these products.  With the help of
interested programmers in Lithuania, we can start creating
converters and experimenting with transfers.

On the other hand, to design anything _very_ good takes a lot of time, and sometimes it is reasonable to seek solutions which can be implemented and applied in practice much sooner.

Context:  The role of Infrared

What differentiates Infrared

The pressure on Infrared

The conceptual solution for Infrared - social act

The value of the conceptual standard for Infrared

The Infrared Data Association has established a special interest group Flow of Experiences to create an infrastructure to accumulate, heighten, reflect on and respond to experiences.

A thoughtful format for aggregates of information may expand the flow of ideas and experiences across Infrared connections in the way that HTML has expanded the movement of documents across the Internet.

Conceptual Standard

Our main goal is to have conceptual standards that are clear
- for the users
- for the programmers

Not necessarily software.  A human standard, not a computer standard.  (It doesn't even have to be software, it could be index cards, for example.  Somebody in a Lithuanian village could be working with index cards, but if they follow our conceptual standard, there will not be any special difficulty in transcribing them into a computer.  Or a computer could print out a set of such index cards).

Compare with object technology, and with Unified Modeling Language.

Import/Export requires constraints

This kind of idea suggests what our standard will mean for the user: If the user abides by the constraints of our paradigm, then our standard will ensure that the user will be able to import/export their aggregates of thoughts across a wide variety of platforms, not favoring any particular visualization.

Importance of Clarity for Users

Conceptually clear to both users and programmers  My first concern therefore is to think through an elegant conceptual standard that will be conceptually helpful to both users and programmers.  I've worked with John Harland on database projects where we've decided to take several days to rename all of the tables and fields so that their names would not cloud our thinking.

In the same way, object technology exists solely for conceptual reasons, that something like it is necessary for humans to work together on large ongoing projects.  On the plane home, I met a scientist from NASA working on the hunt for life.  They define life as "the transduction of energy into creation and maintenance of high information density in complex structures".  In the case of object technology, these are structures of objects, and in the case of accumulating and digesting thoughts, these are structures of thoughts.  Love is that which supports life, and we users - caring authors of robust notes - support the generation of rich structure by being conceptually clear.  By being conceptually clear we show no favor to possibilities, we "love all our children equally".

Tools need to have clarity of what they are.  Andrius proposes to prepare the ontology of the thinker's tool first, and the corresponding format afterwards.  But EACH tool for thinkers has its own terminology and theory (or ontology) about what is most important for possible users.

Conceptual Pitfall: Semantics and Tagging

No need to tag your own thoughts.

Not interested in meaning, but in using structural aspects of software tool.
Be able to abuse tool for our own purposes. First, I want to bring up a big problem with preserving attributes like fonts.  We want to allow for transfer between the widest variety of tools, that is, any tool that may be used or abused for organizing thoughts.

Conceptual Pitfall: The Right Visualization

Visualizations are not Structurally Neutral
Visualizing sequences, hierarchies and networks involves restructuring them.  Each visualization has its advantages, but none of them is structurally neutral.  This is a great difficulty, mostly in working with each other, because people favor the visualization of their favorite product.  A goal for our standard should be that it allow and encourage transformation of information between the six visualizations.
I include sample tools.  Example: a relational database table is structurally neutral, in that its rows are unordered.

A usual goal, in a viewer driven world, is to see the same data but to think about it differently.  In an author driven world, we want to think the same data but to see it differently.  So we're concerned about things that we don't see.  For example, to my knowledge there is no way of visually (without the use of abstract labels) distinguishing between a tree with unordered branches, and a tree with ordered branches.  But there is a tremendous difference in thinking.  We get a handle on this difference by using prompts that our environment supplies us to remind us.  So we are very unhappy if those prompts work against us.  If those prompts encourage us to think of a sequence as ordered (say an ordered list in HTML), when it was originally unordered, then that is very upsetting.  I think our standard should only focus on what we think - what we're aware of as authors - and ignore anything that we see but do not think.
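The difference between the two kinds of tree shows up as soon as we ask whether two trees are the same.  In the sketch below a tree is a (label, children) pair, which is an assumed representation, and `canonical` is an invented helper: two trees that differ only in the order of their branches are equal as unordered hierarchies and different as ordered ones.

```python
def canonical(tree, ordered):
    """Turn a (label, [children]) tree into a comparable nested tuple.

    When ordered is False, children are sorted so that branch order is ignored.
    """
    label, children = tree
    forms = [canonical(child, ordered) for child in children]
    if not ordered:
        forms.sort()
    return (label, tuple(forms))


a = ("root", [("x", []), ("y", [])])
b = ("root", [("y", []), ("x", [])])

print(canonical(a, ordered=False) == canonical(b, ordered=False))  # True
print(canonical(a, ordered=True) == canonical(b, ordered=True))    # False
```

Both trees would be drawn the same way on screen, so the structural type has to travel with the data.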

Hidden relationships
These products can be used in another sense, where we do not focus on visualizing ideas, but editing relationships between them, without visualizing them, working blind, in the dark.  This is what unifies the various relationships.  Also, doing the transformations will presumably "hide=not show" some relationships and "uncover=show" others.  We need our import/export standard to be faithful to these hidden relationships, unsupported by the individual product.

Conceptual pitfall: Hyperfaithfulness

Ben:  "One more thing to consider:  In some programs (e.g. ThoughtStream) links and other sorts of content are stored together in a user-defined order.  In Topic Maps, associations are kept separate from occurrences, so this ordering would be lost.  Is this important?"

Saulius: "No aspect of the information represented in an original format (which will be transformed into another format) should be lost in the standard intermediate format.  E.g., if font sizes are used in the original format, they should be expressed in the intermediate standard form, too.  Then, if the target format supports font sizes, too, they can be assigned (corresponding to the font sizes used in the original format)."

In fact, degradation of information can be very helpful for thinking, and forcing faithfulness can be very unhelpful.  Just because font sizes are used in the original format, doesn't mean that the author chose them.  In fact, Microsoft Word is a horribly bad editor for hypertext documents because it is hyperfaithful, preserving information absolutely, where it was only intended relatively, if at all.  Trying to preserve such peripheral information, even when it is secondary, can interfere with thinking by raising its importance.  Saulius, what use cases support your position, for example, that font sizes should be preserved for organizing thoughts?  Are these, for example, absolute or relative font sizes?

As for font sizes, some tools support them (e.g. all word processors, Lotus Notes applications, etc.: absolute; HTML: relative; TheBrain: both [i.e. relative in the plex window, absolute in the "notes" field of each thought]); others do not (e.g. Andrius' tool, MindMan v2.1 1996).

I will pick on Microsoft Word to make a point.  I used Microsoft Word as a hypertext editor because they had a free one early on, so thank you to Microsoft.  But I had very strange results with things like fonts.  My impression of their approach is that they always assume my main concern is that everything always look as it appears in Microsoft Word.  For example, as an author, I might choose a heading size <H3> as a third level in a hierarchy.  But they might typically conclude that what I really want is to use Times Roman size 14, so they will encode that information as part of the document.  So a Microsoft Word HTML file may be much bigger than a regular HTML file because it has so much filler.  We don't notice that filler until it creates a problem, for example, when I export to another hypertext tool that uses Helvetica as its default font but some pages appear in Times Roman simply because they were originally created in Microsoft Word.  And there's no way to tell Microsoft Word "snap out of it, turn off your fonts!"  Just like the Microsoft Word typo fixer can be a real pain.  Or their smart quotes.  All of these things are very problematic for authors who prefer WYSIWYG - what you see is what you get.  Or better yet: there is nothing more there than what you can see.  As an author and thinker, I want to be in control, I don't want there to be things that I don't know about.  At the least, I want to feel that I am the one who has been making the choices, even if implicitly.  Our import/export standard needs to reinforce that sense of authorship.

Conceptual Pitfall: Faithful Retranslation

The quality of natural language translation is sometimes evaluated by translating the translated text back into the original language, and comparing it with the source text.  Similarly, when the SHN standard is ready, it will be possible to convert a set of thoughts in, e.g.:  TheBrain format(1) -> intermediate standard format -> MindJet format, and MindJet format -> intermediate standard format* -> TheBrain format(2).  If the correspondence of thoughts(2) to thoughts(1) is satisfactory, then the intermediate standard will be considered good; if not, then it will be considered weak.
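The retranslation test can be sketched as follows.  This is a minimal illustration, not a proposed measure: links are assumed to be plain (from, to, type) triples, and keep_all and keep_hierarchy_only are hypothetical stand-ins for converters into a network-style tool and a tree-style tool.

```python
def keep_all(links):
    """Stand-in for a network tool that can hold any kind of link."""
    return set(links)

def keep_hierarchy_only(links):
    """Stand-in for a tree-based tool that drops all but parent/child links."""
    return {link for link in links if link[2] == "child"}

def round_trip(links, there, back):
    """Convert into another tool and back again, returning what survives."""
    return back(there(links))

def fidelity(original, returned):
    """Fraction of the original links that survive the round trip."""
    original = set(original)
    if not original:
        return 1.0
    return len(original & set(returned)) / len(original)

thoughts_1 = {
    ("wishes", "maps", "child"),
    ("wishes", "outlines", "child"),
    ("maps", "outlines", "see-also"),
    ("outlines", "maps", "see-also"),
}
thoughts_2 = round_trip(thoughts_1, keep_hierarchy_only, keep_all)
print(fidelity(thoughts_1, thoughts_2))  # 0.5: the two see-also links were lost
```

Note that a low score here says as much about the tree-style tool as about the intermediate format, which is why the measure alone cannot decide whether the standard is good or weak.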

I think it would be very prudent to allow a round-trip out and back into the same representation, without much degradation. This means supporting ordering (whether intended, helpful or not), attributes, or any other information that the source form regards as important. I know this would make the standard much more complex (by allowing negotiated or private information), and there will be a trade-off to be made.

So while I appreciate that the final outcome of any transfer will be limited to the lowest common denominators of both sides, I think the truncation should occur on import, not export, otherwise the importing side cannot make its own choices. And given that we are dealing with tools that are not merely informational but also attempt to represent insight and understanding, it would be a pity to discard more of the "textural content" than is absolutely necessary.
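One way to picture "truncate on import, not export" is the sketch below.  The field names and both functions are assumptions made for illustration only.  The export side copies everything the source tool holds; the import side decides what to keep and tells the author what it left out.

```python
def export_thought(thought):
    """Export side: copy every field the source tool holds, dropping nothing."""
    return dict(thought)

def import_thought(record, supported_fields):
    """Import side: keep what the target tool supports and report the rest."""
    kept = {k: v for k, v in record.items() if k in supported_fields}
    dropped = sorted(k for k in record if k not in supported_fields)
    return kept, dropped

source = {"content": "Deadline", "position": 3, "color": "red"}
record = export_thought(source)
kept, dropped = import_thought(record, {"content", "position"})
print(kept)     # {'content': 'Deadline', 'position': 3}
print(dropped)  # ['color']
```

Because the dropped fields are reported rather than silently discarded, the importing side, or the author, can still make its own choice about them.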

It will be a challenge to have no degradation going from product A to our standard S and back to the product A, and most likely there will be degradation going all the way to another product B and back.

Saulius: "If the transfer is between systems of similar functionality - then 1:1 transfer is more universal, more acceptable to a larger group of users."

I think we should assume and focus on the case where systems are of very different functionality.  I think if we can do meaningful transfer between TheBrain, MindManager and ThoughtStream then we are doing our job.

Streamline environment, writing for oneself
- I can worry about the font size as it appears to me.  In that case I would try to adjust my browser/editor.  But then, as Saulius writes, it's a quick fix.  So it's not that important, for authoring, to preserve the font size, when they are all the same.  It can be whatever the default is, and then I can change it.

Conceptual Pitfall - go through all information

 

Constraint: punctuate thought

 

Constraint: prompt

Constraint: relationship

Constraint: structural type

Constraint: ID

Constraint: attributes

It is definitely unsafe to make assumptions about what is helpful or unhelpful in a person's thinking. We live and think in a spatial, colourful and texture-rich world (notwithstanding sensory impairment). Tools which capture and exploit these attributes make it easier to represent thoughts meaningfully, and to recognise them qualitatively or subliminally (e.g. using Red for important things). While it may be helpful in analysis processes to take a new view of information by optionally filtering these attributes, I also suggest that homogenisation significantly retards recognition and response, and that a useful part of the data is lost. We don't just recognise a thought by its name, but by where it is, how big, what colour, what it's near to etc.

You would be surprised at how much difference even just the font can make to interpretation. The same statement in different fonts can appear friendly, casual, authoritative or even as SHOUTING :-)

Your points are very helpful, so I'm looking for how we can best handle this.  I think the work of Tony Buzan (the inventor of mindmapping) is very relevant here.  He writes about concepts and principles he has noticed that are relevant for organizing thoughts.  When I was in North Carolina, I gave Ben Darnell a beautiful book by Barry Buzan on these concepts, and I hope to find another copy soon.  These include, as you write, the use of color, different fonts, etc. to accentuate ideas, presumably to help bring out differences between them, and how they might be structured.

Instead of "marking up" such attributes - the traditional reaction in the XML world - I think it would be great to bring them out as thoughts in their own right, so that "red" is - first of all - not a physical color for a font, but rather a conceptual prompt.  It may certainly be a very important prompt for a user, especially one who uses a color system, but I imagine the exact physical colors do not matter so much, but rather that there exists some way of expressing that attribute as a conceptual prompt.  The author - in doing import/export - might have to decide how the new software would express a color like "red", or a particular font, or a font size, which may or may not make sense in the new context.  I feel that for our standard we should make import/export feel like part of the authoring process.

My feeling is that if we do justice to what it means to be an author - drawing from Tony Buzan's work, that of others, and use cases - and we transport this relative to the new context, then this is much better than trying to stick absolutely to attributes (such as color, font, font size) in the old context which may very possibly be destructive unless they are rendered thoughtfully in the new context.

I'm afraid this would lead to an increasing vocabulary of link types or thought types in the standard (thought "A" linked to "Emphasis" which is in turn linked to "Red", with these links or thoughts marked in some way so that the software recognizes them as instructions for the display rather than simple data).  I think it is appropriate to leverage existing standards here, such as XHTML/CSS or XSL[T].  Using stylesheets lets you separate semantic markup from display characteristics: <thought><name><xhtml:em>A</xhtml:em></name>...</thought>

Here, I agree - and thanks to Saulius and Nick for bringing this up - that when differences in fonts are used to accent thoughts then we should respect those differences.  How will we deal with these cases?

Even if we stick to software, the visualizations that various tools use are so different that it's not meaningful - and often terribly distorting - to transfer absolute font sizes.  A software that visualizes thoughts as emanating from a center often uses very tiny fonts at the periphery.  A software that displays thoughts in sequence, as a word processor does, can use very large fonts.  A font size that makes sense in one program has no reason to make sense in a different program.

As Nick pointed out, font size, font type, colors, images and such can play a vital role in MindManager, and I understand their use in accenting thoughts, helping to remember and appreciate them, to be part of Tony Buzan's theory of mindmapping.  But these are things that we do think.  For example, a blind person should be able to read that mindmap, should be able to note, "Hmm, they deliberately chose red for this, blue for that."  Yes, the colors are chosen just so to prompt the author, but I think that even more important than the physical color is the mental prompt that it serves.  So the author may think "RED" is a color I reserve to have certain significance as a prompt.  For example, Edward de Bono has a theory of six thinking hats: the red hat, the blue hat, and so on.  So maybe "RED-BONO" is a more appropriate prompt.  Or in a different mindmap it may be "RED-COMMUNISM".  Now there may also be a preferred RGB number for that red, or a preferred font for communism.
But I think a good goal for our standard would be to first worry that the attribute (say, color) is exported in the abstract - as a thought in its own right, say "RED" or "RED-1" or "COLOR-1".  If the author does not and will not elevate it as a thought in its own right, then I think we have no business exporting it.  Secondarily, when it is exported, it may carry in the associated thought information on how it was originally expressed.  But the author should have control - during the import/export process, which may often require the help of an interactive converter - to import into a new program in such a way that "RED" is expressed in an appropriate manner, perhaps as a sound, or a certain kind of pulsation or motion, or a dashed line, or a different color, or even just a word.  Import/export of such attributes will depend on the converter knowing that product X offers such and such physical attributes for the author to use, and the author deciding how to map the mental attributes they use to the physical attributes.  (In a sense, each product has its own "stylesheet language" which the author can make use of.)  Or the author may treat the attribute as just another thought.  In this way, any thought can be transformed by the author into an attribute that the program he or she is importing to makes available.
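The mapping described above, from the author's mental attributes to whatever physical attributes a product offers, might look something like this.  It is a hedged sketch: the function express, the attribute kinds and the example values are all invented for illustration, and a real converter would ask the author interactively rather than read a fixed table.

```python
def express(prompt, author_mapping, product_offers):
    """Return how a conceptual prompt is rendered in the target product.

    author_mapping: the author's choices, prompt -> (kind, value)
    product_offers: the kinds of physical attribute the product supports
    """
    choice = author_mapping.get(prompt)
    if choice is not None and choice[0] in product_offers:
        return choice
    # Fallback: the attribute stays an ordinary thought, expressed as a word.
    return ("word", prompt)

# Hypothetical choices made by the author during import.
author_mapping = {
    "RED-BONO": ("color", "#CC0000"),
    "RED-COMMUNISM": ("line-style", "dashed"),
}
mindmap_tool = {"color", "line-style"}  # assumed capabilities of one product
outline_tool = {"line-style"}           # assumed capabilities of another

print(express("RED-BONO", author_mapping, mindmap_tool))       # ('color', '#CC0000')
print(express("RED-BONO", author_mapping, outline_tool))       # ('word', 'RED-BONO')
print(express("RED-COMMUNISM", author_mapping, outline_tool))  # ('line-style', 'dashed')
```

The prompt itself is what travels between tools; the physical rendering is chosen again in each new context.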

No Constraint: can be invalid, inconsistent, self-contradictory

Conceptual Description

Components

In general, I imagine each thought as a relationship between two thoughts, possibly null.  The thought is as simple as possible, which is to say, the thought is made up of five "fields":
  1. - the thought (content)
  2. - the prompt (ID)
  3. - the prompt of the thought it takes us from
  4. - the prompt of the thought it takes us to
  5. - the type of the relationship
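The five fields listed above could be held in a record such as the following.  The class and field names are my own assumptions, chosen only to mirror the list; nulls are represented by None.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Thought:
    content: str                         # 1. the thought
    prompt: Optional[str] = None         # 2. the prompt (ID)
    from_prompt: Optional[str] = None    # 3. prompt of the thought it takes us from
    to_prompt: Optional[str] = None      # 4. prompt of the thought it takes us to
    relation_type: Optional[str] = None  # 5. the type of the relationship

def is_relationship(thought):
    """A thought relates two others only when both ends are non-null."""
    return thought.from_prompt is not None and thought.to_prompt is not None

tools = Thought("Tools for thinking", prompt="tools")
maps = Thought("Maps", prompt="maps")
link = Thought("Maps are one kind of tool", prompt="l1",
               from_prompt="tools", to_prompt="maps", relation_type="child")

print(is_relationship(tools))  # False
print(is_relationship(link))   # True
```

A standalone thought and a link between thoughts share one shape, so a converter needs only one kind of record.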

Interpretation of Nulls

Viewer is not null, that means there is a means to decode the thought.
Viewer is null, that means there is nothing to decode.

Address is not null, that means there is additional information to be retrieved to explain further, and the thought is a comment on that information.  The thought is what is gained, released simply by breaking down the thought into a question and answer.
Address is null, that means that the thought is self-explanatory.

Prompts need not be unique
We don't need namespaces because the information is where it is.  So at one level there is no possibility of confusion because you have what you get.  So prompts do not have to be unique.
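If prompts need not be unique, then a lookup by prompt returns several thoughts rather than one.  A minimal sketch, assuming thoughts are simple (prompt, content) pairs and using a hypothetical helper index_by_prompt:

```python
from collections import defaultdict

def index_by_prompt(thoughts):
    """Group thoughts by prompt; one prompt may lead to several thoughts."""
    index = defaultdict(list)
    for prompt, content in thoughts:
        index[prompt].append(content)
    return index

aggregate = [("RED", "urgent"), ("BLUE", "overview"), ("RED", "stop and check")]
index = index_by_prompt(aggregate)
print(index["RED"])   # ['urgent', 'stop and check']
print(index["BLUE"])  # ['overview']
```

Software that imports such an aggregate should therefore not assume that a prompt works as a primary key.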

Implementations of the Standard

Importance of Clarity, Transparency

Conceptually clear to both users and programmers

Implementations in Carrier Standards

Implementations in Software Tool Export

Issue: attributes

Usage Protocol

Straightforward Import/Export of an aPack

As we do this, we should rather quickly get a pretty good picture of the kinds of subsystems that come up.  Conversion (from product A to our format to product B) can then be broken down into the following issues:
 
 

Integrated Import/Export of Object Store of aThoughts

Interpretation of other objects as aThoughts

Basis for negotiations

Future versions