Archive for the ‘Context’ Category

Finally, a Wiki where Information is Managed Content

August 23, 2010

Finally a Wiki where Articles are Fully Managed Content

Wikis are one of the most mentioned Enterprise 2.0 tools.  Most (dare I say all?) E20 vendors have one or incorporate one into their solution stack.  However, if you look more closely, many of them simply roll in some open source wiki server and call it a day.  While basic functionality for wikis is almost standard these days, the information architecture underpinning the wiki is often overlooked.  What happens is that the “wiki-widget” proponents end up sacrificing information availability for information presentability.  The “we’ve got a wiki too” crowd is so caught up in achieving buzz-word parity that the real benefits of a fully managed and integrated wiki solution are passed over.  The result is a loose hodge-podge of stand-alone “web 2.0” widgets that have been lumped together with a common user interface thrown on top.  The vendors call it good.

Portal vendors are some of the worst offenders here.  The ease with which widgets are surfaced in a single common UI lends itself to lazy integration. In these kinds of environments the wiki widget may appear next to the JCR enabled content repository but there is NEVER ANY LINKAGE BETWEEN THE TWO!

Seriously, WTF???  If enterprise Wikis are the best place for enterprise knowledge bases, best practices and employee generated tips and tricks (AND THEY OFTEN ARE!), then what in the world is any enterprise information architect worth his or her pay grade doing being happy with throwing key corporate knowledge assets into their own walled-off database silo?  The answer is that most are happy with the loose “on the glass” integration provided by a portal or creative use of iFrames.  This is a tragedy and a terrible misuse and underuse of corporate knowledge assets.  Fortunately, Fishbowl Solutions has developed a fully ECM-integrated wiki that combines all the latest wiki features with the power of Oracle Enterprise Content Management.

(more…)

Type of Personalization in Portals – Content Filtering Personalization

May 10, 2010

In the first post about personalization in portals we talked about the most common form of personalization, User Personalization.  This is a manual action initiated by the user to tailor the experience on a site to their personal preferences.  This is great but it does not leverage some of the inherent benefits of using portal technology with an ECM system like UCM.

So in this post we will talk about the second kind of personalization in portals, Content Filtering Personalization, and outline a solution for doing this type of personalization in a JSR-168 standards-based portlet consuming content from UCM.

(more…)

If E20 is the Shiznit, Why are We Still Using Email?

April 26, 2010

Presentation given by Billy Cripe at Collaborate 2010
Las Vegas, NV
April 21, 2010

Types of Personalization in Portals – User Personalization

March 29, 2010

In the white paper we posted on 3/9 (Integrating ECM with Portal Technologies) I wrote a section that gave an overview of the 3 main types of personalization that are normally implemented in a portal environment.

  1. User Personalization
  2. Content Filtering Personalization
  3. Trend Analysis Personalization

In a short series of posts over the next few weeks I will go into a bit more depth on each type that I mentioned in the paper including technical details when applicable.

First up is User Personalization. (more…)

File Naming, GUIDs, Duplication, Identity and Metadata: A Response to John O’Gorman

February 12, 2010

I love social network conversations. There has been a great one going on over in LinkedIn. In the AIIM Group for Intelligent Information Management a conversation was started around whether or not file naming conventions were needed when we have robust EDMSs (enterprise document management systems). John O’Gorman, an Information Integration specialist, made a provocative post (in the spirit of great dialog) and I responded. The answers and debate have grown and now, rather than take up the whole forum, I am posting my reply here so that you may participate as well!

If you have not read the thread, you can do so HERE (if you want to skip to the Billy vs. John debate, go to page 3). Without further ado, here is my reply:

I appreciate the engagement and invite others into the fray. I think this makes us all sharper! So in the spirit of mutual enlightenment and the disputational interrogative we engage!

We start off on common ground, agreeing that humans are *much* better suited to pattern recognition and discrimination than programs are. While programs can process vastly larger quantities of information, we can identify and consume “relevant” information more efficiently (at least for now).

1) You mention that a computer cannot have even 2 files with the same name in a folder while we can pick out one from many quite easily. I agree with the example you give, but the question was not answered and your answer imports some assumptions that aren’t necessarily so. Let me explain. I would argue (based on my pop-sci understanding) that human discrimination is facilitated by the way our brains “tag” memories with unique identifiers. We use electro-chemical “naming conventions”; programs use other conventions. Same fundamental strategy though. To this extent, the argument that the strategy computer programs use is bad because it is different from what human brains use fails. Showing a difference in result does not impugn the process, merely the efficiency or execution or a host of other limiting factors. Secondly, your example imports some assumptions that do not hold. Why do you assume the Windows file system uniqueness requirements? I work with EDMSs that can store N number of files with the same file name in the same “location”, display those in a single collection and slap a “folder” icon on top of it. Computer programs? Yep. Windows file system limits? Nope.

2) Maybe I didn’t understand when you originally stated, “the reason we put meaningful labels on anything is because no one has come up with a better alternative”. This sounded to me like you considered this the worst of no possible alternatives. I am not sure why you feel this way. As you say, it seems to work pretty well for our brains, and while search isn’t up to our levels yet, it is getting there. I would also argue that basic keyword indexing (tokenization) has limitations. But aren’t you creating a straw man argument here? Why must disambiguation happen at this level? Search engines also incorporate (and are increasing their incorporation of) other weighting/prioritization/relevancy axes in order to achieve disambiguation and increase relevancy. This is why entity extraction and ontology-assisted querying are so promising. Before going there though, disambiguation and prioritization happen through incorporating inbound linking, folksonomies, usage/consumption patterns and other factors. Bibliographic tracing is quite popular in higher education systems in order to figure out which concepts are derivative and which are foundational. Computer programs are doing the tracing and therefore the discrimination here.

3) I grant that, as you say, two different GUIDs only assert that the two resources to which they are associated are deemed different. But again you brought up the distinction between systems and humans. Pictures of (the same) JaneSmith at age 3 and at age 33 are very different, and *depending on your prior relationship with her* the difference may be enough to prevent your identification of her as the same person or (if you are her father) not enough to confuse you even for a moment that it is the same person. This raises two very important questions around meaningful (aka relevant) difference and identity. With a person we presume continuity of something that transcends cellular existence (personality, soul, spirit, whatever). With information what do we have? Changing a single byte in a document between versions alters checksum values, thereby creating an entirely new item (by one interpretation). But that one byte is not likely a meaningful difference and so we are comfortable with maintaining the ID of the item. So hyper content-centric identity of information seems not to be useful. Alternately, I can create an empty “unique id” for an item in my EDMS and then proceed to associate that ID with an image of JaneSmith, then a resume for JaneSmith, then a video of the Space Shuttle. That unique ID retains its uniqueness among all other items in the system but is a container for “nothing – image – document – video”. The ID is still able to be distinguished from all others in the system. The difference is the intentionality with which it is assigned. Humans import definition by creating the ID and therefore create the uniqueness regardless of what the “thing” is that the identifier points to. This is extrinsic identification rather than intrinsic identification. I think there is a place for both kinds of identity in our world, but EDMSs generally focus on and utilize extrinsic identity. This is because we can import purpose more easily to extrinsically identified objects (since we control them) than to intrinsically identified objects (which we have to find a use for).
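
To make the intrinsic vs. extrinsic distinction concrete, here is a minimal Python sketch. It is an illustration only, not how any particular EDMS works: the `intrinsic_id` helper, the sample byte strings and the `store` dictionary are all made up for this example. It shows that a one-byte change produces a different checksum, while an assigned ID stays the same no matter what it is pointed at.

```python
import hashlib
import uuid


def intrinsic_id(content: bytes) -> str:
    """Identity derived from the bytes themselves (a SHA-256 checksum)."""
    return hashlib.sha256(content).hexdigest()


# Two hypothetical versions of a document that differ by a single byte.
v1 = b"Jane Smith resume, revision A"
v2 = b"Jane Smith resume, revision B"

# Intrinsic identity: one changed byte yields a different checksum,
# so by this interpretation v2 is an entirely new item.
ids_differ = intrinsic_id(v1) != intrinsic_id(v2)

# Extrinsic identity: the ID is assigned first, by an act of intention,
# and can then be associated with anything at all.
item_id = str(uuid.uuid4())
store = {item_id: None}                      # an empty container
store[item_id] = "image of JaneSmith"
store[item_id] = "resume for JaneSmith"
store[item_id] = "video of the Space Shuttle"
```

The checksum tells you the bytes changed but not whether the change matters; the assigned ID tells you nothing about the bytes but carries whatever purpose the person who created it had in mind.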

4) I agree whole-heartedly that, as you say, “There is some philosophy in every human endeavor, just as there are some mathematics in every computer”. I, like you, enjoy engaging on this level as well! So props to us and the readers who enjoy it. I think this is an important area where we can elevate the discussion and practice in our community. So I’ll bite again. While I love talking about how interior angles of a triangle can add up to 270 when laid out on a sphere, I’m not seeing the information science analogy, but I am excited to hear it! In the meantime, I do not follow your LA example. You say that “without being told ‘Los Angeles’ (GUID h34dh23a4b7b33c8361) is different than ‘Los Angeles’ (GUID 7b33c8361h34dh23a4b) a computer is ignorant of that difference.” But I would reply that the *fact* of the different GUIDs is the discriminating factor for a computer. So by definition a computer *must* “know” that title attribute “Los Angeles” of GUID 1 is different and distinct from title attribute “Los Angeles” of GUID 2. The trouble comes up in how those GUIDs were assigned. If extrinsically assigned (i.e. through an act of intention) then we the human consumers of that information assume/rely on the idea that the difference is meaningful. If intrinsically assigned (e.g. automatically via a crawler or something else) then we get into potential duplication / overlap / synonym problems (e.g. the checksums were different but the objects were not meaningfully different). The trouble I have with this is that meaning is always imported by humans and is dependent on the scope of our problem domain. Google Earth can show the globe or my back yard. At one scope of the problem domain (e.g. where do you live?) both the “globe” and “my back yard” answers are correct. To a space alien the globe provides the appropriate level of meaning. To my mother the back yard provides the appropriate level of meaning. So by setting up the question the way you have, you seem to be assuming a common scope, which is only ever extrinsically identified and therefore unable to be held in common.
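
The Los Angeles point can be sketched in a few lines of Python. The two GUID strings are the ones quoted from John’s example; the `Record` class and the `guids_for_title` helper are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    guid: str    # the identifier the system discriminates on
    title: str   # a descriptive attribute, not an identity


# The two items from the quoted example: same title, different GUIDs.
a = Record(guid="h34dh23a4b7b33c8361", title="Los Angeles")
b = Record(guid="7b33c8361h34dh23a4b", title="Los Angeles")

same_title = a.title == b.title   # the titles match
same_item = a.guid == b.guid      # the GUIDs do not


def guids_for_title(records, title):
    """Return every GUID that carries the given title attribute."""
    return [r.guid for r in records if r.title == title]


matches = guids_for_title([a, b], "Los Angeles")
```

The program has no trouble “knowing” that these are two items. What it cannot supply is whether the difference between them is meaningful; that depends on how and why the GUIDs were assigned.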

5) I think the “assigned to maintain” vs “derived to describe” strategy is very good and the best part is that they are not mutually exclusive. I agree with you that these are, “an interesting twist on randomly assigned object identifiers”. But these are quite common. They are simply at different levels of the problem domain. OWLs, most EDMSs and other relationship maps do this all the time. Using your example, ‘Management Salary Policy’ and ‘Management Gehalts Politik’ and ‘Política del Sueldo de la Gerencia’ and ‘???????? ???????? ??????????’ would share a common identifier that acts as a meta-identifier. The reason for this is that (I assume) you are using a localization example where we have 4 different translations of the same policy. In this case we humans understand that the collection, the set, is a common set and should be related and identified with a common identifier. The difference at the set level is not meaningful. At a deeper level the language difference becomes meaningful only after the set has been identified and located. Here is where we start delving into Derridean concepts of différance and what is meant by identity. Suffice it to say that this is the realm of extrinsically assigned (e.g. derived to describe) identification.

6) Here we’re getting down to brass tacks. I agree that search is a big pain point for most organizations. AIIM, Gartner, Forrester, Ovum, Gilbane and others all agree. But most EDMSs don’t care if multiple files with the same name are stored. I should be clear: by files with the same name I mean “file name” (e.g. BlogPost.doc or MyPresentation_Final.pptx or LinkedInReply.txt). This is because the EDMSs will store that filename as an attribute but give the file a GUID. The good EDMSs (like Oracle UCM) will provide a set identifier (ContentID) that identifies the set of revisions, which may be substantially different from each other. Each revision has its own unique identifier (dID). Furthermore, ContentIDs can be auto generated, auto derived from rules / extraction processes or manually created each and every time. Additionally, content objects can be associated with each other along N number of axes for intentional purpose-based collecting/discovering/location/identification. These axes may or may not be indexable. This allows information discovery (classification, grouping, categorization) to be brought alongside information location (querying, retrieving) rather than relying simply on one or the other or requiring a serial approach of one then the other. I fundamentally disagree with your statement that, “nor is it considered best practice in an EDMS to encourage or even allow contributors to randomly assign their own metadata.” First, no user ever “randomly” assigns their own metadata. At least not in true random form. Second, your statement begs the question of what is metadata and what should end users create and consume? So are Flickr tags metadata? Yes. Should users create them and be enabled to create them? YES! Are star based ratings on blog posts metadata? YES. Should users be allowed and empowered to engage with rating systems? YES! What about “comments” or “descriptions” or “due date”? I would argue that in contextually appropriate situations users should always have at least the option, if not the requirement, to add descriptive and intentional metadata to their content objects. The more that entity extraction systems (e.g. OpenCalais, GATE, CLARABRIDGE etc) become commonplace, the easier we can make it for people. But I do not foresee a time when information creators will be able to, or should, stop describing and classifying what they have created or consumed.
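
For readers who think in code, the ContentID / dID arrangement described above can be sketched roughly like this. To be clear, this is not the Oracle UCM API; the class names, the `check_in` method, the counter and the sample IDs are all hypothetical and exist only to show the shape of the idea: the file name is just an attribute, each revision gets its own identifier, and a set identifier groups the revisions.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical system-wide counter standing in for revision IDs (dID).
_next_did = count(1)


@dataclass
class Revision:
    d_id: int        # unique per revision
    file_name: str   # stored as an attribute, not used as identity
    data: bytes


@dataclass
class ContentItem:
    content_id: str                          # identifies the set of revisions
    revisions: list = field(default_factory=list)

    def check_in(self, file_name: str, data: bytes) -> Revision:
        """Add a new revision to this item and give it its own dID."""
        rev = Revision(next(_next_did), file_name, data)
        self.revisions.append(rev)
        return rev


# Two unrelated items, plus two revisions of the first one,
# all carrying exactly the same file name.
first = ContentItem("DOC_001")
second = ContentItem("DOC_002")
r1 = first.check_in("BlogPost.doc", b"draft")
r2 = first.check_in("BlogPost.doc", b"final, substantially rewritten")
r3 = second.check_in("BlogPost.doc", b"an unrelated post")
```

Nothing breaks when three files called BlogPost.doc land in the same place, because nothing in the model ever asks the file name to be unique.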

7) You write that “This issue of whether or not to have a convention for naming files is symptomatic of a systemic problem.” I agree wholeheartedly. But I disagree that this means that all systems fail to solve the problem. Indeed, I am very confident that with technologies like the Oracle ECM system and Fishbowl Solutions add-on modules such as our Subscription notifier, Workflow solution set, CollabPoint and Advanced User Security Mapping, along with our solutions like Contract Management, Policies and Procedures, Admissions Office Onboarding and Research Solutions, we can and do solve the business problems around file naming conventions, information location and efficiency boosting.

So I will echo your sentiment at the end, with which I agree unequivocally: In the spirit of the community of intelligent information management, I’m just sayin’…