For Now, Find Me at Hack Library School


Hi Everyone!

As I’m sure you can tell from the date of my last post, it’s gotten a bit busy for me in library school. Well, it has, and I’m hitting the home stretch before my awesome library school journey comes to an end. Hold your tissues, please.

I do still blog, but as some of you already know, I was selected to be part of the 2013-2014 Hack Library School contributing writers’ group, and most of my blogging is done there. You can check out some of the posts I’ve written down below for your convenience. I’ll update the list as more are added.

But feel free to check out the Hack Library School site in general, because there’s a pretty snazzy crew of LIS writers with insights into all that you need to know to survive library school in one piece.

As school comes to an end, I promise to devote more time to this blog. I’ve got big schmancy wicked plans to turn this little blog into an awesome digital portfolio. What do you think of the new template, by the way?

So check back periodically if you can. And if you can’t, you can find me on Twitter, where I hang out more frequently with other quirky LIS peeps.

Ciao for now!

-Aidy

Image credit: By Immanuel Giel (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Subjects Paper

Information Organization

LIS 5703

Professor Michelle Kazmer

Aidy Silva-Ortiz

Subjects

12/09/2013

———————————

Analysis of Class Folksonomy

Chapter 10: Systems For Vocabulary Control

As Taylor and Joudrey (2009) point out at the beginning of this chapter, “…there is evidence that people writing about the same concepts often do not use the same words to express them, and people searching for the same concept do not think of the same words to search for it” (p. 333). With a class of 30 students, all in the same branch of study (similar audience) and assigned the same tasks (similar information needs), it is interesting to see how the creation of a “flat classification system, using tags as descriptors,” or folksonomy, reflects the truth of that statement (Kakali and Papatheodorou, 2010, p. 192).

Different Frames of Mind. Each of us had a distinct motivation and semantic framework for identifying the characteristics of a specific journal article. Our motivations for the terms chosen to describe each article encompassed a wide range of areas: from the need to describe the aboutness of an article, to following an instruction, to relying on the originating source’s controlled vocabulary terms, to simply making the article personally discoverable through a set of terms most familiar to us. All of this funnels down to which descriptors or tags we choose to relate our mental concept of the item to the actual resource (Guy and Tonkin, 2006).

Controlled Vocabulary versus Folksonomy. Is our class set of descriptors a controlled vocabulary? No. A controlled vocabulary requires a list of terms to have structure and relation to one another, and may have some form of authority over those terms. Kakali and Papatheodorou (2010) make it clear that a group of tags has “no authority control, nor are there selection criteria and instructions for tag generation” (p. 192). In other words, no one is responsible for which tag descriptor is linked to which item or concept (with the exception of the Required Descriptors). Since the tags lack authority control, do they maintain structure? Taylor and Joudrey (2009) divide controlled vocabularies into three groups: subject heading lists, thesauri, and ontologies. Subject heading lists and thesauri are made up of a hierarchy of terms that are broader or narrower than, and in some way related to, one another. They also require authority control over the terms used to represent a concept. Ontologies do not use an “authorized term” but do carry a hierarchical structure (p. 334). Therefore, our class descriptors remain a folksonomy.

Descriptors

Collectively, the class created a total of 255 unique descriptors (see Appendix 1) used to describe 135 articles allocated from the FSU Library web site as part of Contribute Assignments 1, 2, 4, and 5. For each article, the student was instructed to include 6 descriptors, 3 of which were required as part of the assignment: a paper-name descriptor (Resource and Description or Subjects), an assignment-name descriptor (Contribute 1, 2, 4, or 5), and a Z-Name descriptor (to identify the student who uploaded the article). The 3 remaining descriptors were chosen by the student.

Tag Morphology. Reviewing all 255 unique descriptors revealed key variations in term make-up. There were 190 singular terms and 65 plural terms. Word phrases accounted for 204 of the terms, with only 51 single-word terms. Abbreviations made up 26 of the terms on the list.
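Tallies like these can be reproduced with a short script. The sketch below uses hypothetical sample descriptors in place of the actual class list, and the plural and abbreviation checks are naive heuristics of my own, not the method used for the paper:

```python
# Hypothetical stand-ins for the actual 255-term class list.
descriptors = [
    "Folksonomy", "Folksonomies", "Tags (Metadata)",
    "controlled vocabularies", "LIS", "metadata tags",
]

def is_phrase(tag):
    """A multi-word descriptor counts as a word phrase."""
    return len(tag.split()) > 1

def looks_plural(tag):
    """Naive plural check on the final word (a heuristic only)."""
    return tag.split()[-1].rstrip(")").lower().endswith("s")

def looks_abbreviation(tag):
    """Treat all-caps tags (e.g., 'LIS') as abbreviations."""
    return tag.isupper()

phrases = sum(is_phrase(t) for t in descriptors)
plurals = sum(looks_plural(t) for t in descriptors)
abbrevs = sum(looks_abbreviation(t) for t in descriptors)
print(phrases, plurals, abbrevs)
```

With the real descriptor list substituted in, the same three counters would produce the phrase, plural, and abbreviation totals reported above.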

Methodology. To best illustrate which descriptors were used most often and under what circumstances, a word cloud was created using the infogr.am site. All 255 unique descriptors were copied from RefWorks into a Numbers spreadsheet. Each descriptor was linked to the number of times it was used to tag articles in the RefWorks bibliography list. The spreadsheet contained five sub-sheets titled All Descriptors, Required Descriptors, Resource and Description Descriptors, Subject Descriptors, and Tag-Related Descriptors. It was then exported as an XLS file and imported into the infogr.am site as a data set. Once uploaded, the word cloud was published as an interactive data visualization.
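The descriptor-to-frequency table behind the word cloud amounts to a flatten-and-count step. A minimal sketch, using hypothetical per-article tag lists in place of the real RefWorks export:

```python
from collections import Counter

# Hypothetical sample of per-article tag lists standing in for the
# RefWorks export; the real data covered 135 articles.
article_tags = [
    ["Subjects", "Contribute 4", "folksonomy"],
    ["Subjects", "Contribute 5", "Tags (Metadata)"],
    ["Resource and Description", "Contribute 1", "Tags (Metadata)"],
]

# Flatten and count: this is the (descriptor, frequency) table that
# would be exported as XLS and fed to the word cloud tool.
freq = Counter(tag for tags in article_tags for tag in tags)
for tag, n in freq.most_common():
    print(f"{tag}\t{n}")
```

Each sub-sheet in the spreadsheet is just this same table filtered to a subset of descriptors.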

Figure 1.

LIS 5703 Class Folksonomy

(Source: http://infogr.am/lis-5703-class-folksonomy?src=web)

All Descriptors. For a bird’s-eye view of the list of descriptors, all 255 unique descriptors were uploaded to a word cloud (See Figure 1.). From here we can view Golder and Huberman’s (2005) “collective sensemaking” at work, whereby the students’ collaboration to create a large classification system demonstrates both “idiosyncratically personal categories” and those “widely agreed upon” (p. 201).

The most used terms were the most visible and prominent in the cloud; the lesser used terms were harder to discern. The majority of terms were used only once and account for most of the cloud’s make-up. The Z-Name descriptors were not incorporated into this word cloud, since they were used to link a student to a submitted article rather than to describe the article itself.

Figure 2.

[Word cloud: Required Descriptors]

(Source: http://infogr.am/lis-5703-class-folksonomy?src=web)

Required Descriptors. Of all the descriptors, this set of terms was the most prominent (See Figure 2.). The reason, of course, is that without these descriptors one could not receive full credit for the assignment. These descriptors also played an important role in discoverability, since they helped the professor see who submitted an article reference (Z-Name descriptors) and for which assignment (Resource and Description, Subjects, and Contribute descriptors).

Figure 3.

[Word cloud: Resource and Description Descriptors]

(Source: http://infogr.am/lis-5703-class-folksonomy?src=web)

Resource and Description & Subjects Descriptors. If a controlled vocabulary hierarchy were created from this folksonomy, these word clouds would provide good examples for illustrating the difference between broader, narrower, and related terms (See Figure 3 and Figure 4). The authority term would be the largest term found in the cloud (Resource and Description and Subjects), with the smaller text listed as narrower or related entry terms.

There is also a visual difference between the two word clouds. The Resource and Description Descriptors word cloud had more unique terms than the Subjects Descriptors word cloud. It can be deduced that in the first paper assignment students relied more on their own lexicon than in the second. By the second assignment, students would have been inclined to take advantage of “the social tagging aspect of tagging services…as a kind of feedback mechanism for the folksonomy,” whereby one person “observes how others have tagged a resource, [and is] more likely to adopt a similar tagging vocabulary” for similar resources (Sinclair & Cardew-Hall, 2008, p. 16). Once a term had been created as part of the first assignment, and as users got used to the tagging mechanics, more reliance was placed on others and on the system, increasing the likelihood that terms would be used more than once.

Figure 4.

[Word cloud: Subjects Descriptors]

(Source: http://infogr.am/lis-5703-class-folksonomy?src=web)

Figure 5.

[Word cloud: Tag-Related Descriptors]

(Source: http://infogr.am/lis-5703-class-folksonomy?src=web)

Tag-Related Descriptors. Tagging tags about tagging is about as meta as one can get in an LIS class (See Figure 5). This word cloud shows the difference that word order makes in descriptors, especially when defining the exact same concept. The term “Tags (Metadata)” was used 16 times, while its sibling term “metadata tags” was used only once. Inverted versus direct order is part of the ongoing challenge of constructing the terms that make up a controlled vocabulary schema. Most institutions are doing away with terms in inverted order, since evidence shows that users prefer direct order when searching by key terms (Taylor and Joudrey, 2009, p. 338). It is interesting to note the opposite in this example.
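One way to detect inverted/direct order siblings like these is to normalize each tag to a sorted bag of words, so that both variants collapse to the same key. This is a sketch of my own, not part of the assignment:

```python
import re

def normalize(tag):
    """Collapse inverted/direct order variants to one key by
    lowercasing, stripping punctuation, and sorting the words."""
    words = re.findall(r"[a-z0-9]+", tag.lower())
    return " ".join(sorted(words))

# The two sibling terms from the word cloud normalize identically:
print(normalize("Tags (Metadata)") == normalize("metadata tags"))
```

A controlled vocabulary editor could use such a key to flag candidate merges, then pick the direct-order form as the authorized term.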

Subject Access Systems: A Comparison

Similar to the RefWorks tagging scheme, many social tagging sites utilize what is called a “flat folksonomy,” one that lacks structure or hierarchy, where “users are free to attach a tag (or tags) to their web content…according to their own needs…[where] no predefined tags exist to categorize the content” (Yoo, Choi, Suh & Kim, 2013, p. 594). User-generated tagging schemes are flexible in that they allow the user to create tags suitable for various purposes, as seen in the RefWorks descriptor schema. Below are examples of three subject access systems that share characteristics with RefWorks insofar as they help users construct folksonomies.

Subject Access System – CiteULike

CiteULike and RefWorks are quite similar in their intended purposes. Both sites allow users to import article citations to create large bibliographies. RefWorks requires a paid subscription and/or institutional access, while CiteULike is free.

Figure 6.

[Screenshot: CiteULike tag list]

All tags are user-generated and can be used to define the aboutness of an item, pull key terms, optimize discoverability within a large library of article references, or bookmark articles under a series (e.g., “lis5703paper”). Unlike RefWorks, however, CiteULike does not adhere very well to “tagging as is.” Rather, the user uploads a new article citation and then submits a list of descriptor tags that are hyperlinked back to the article in alphabetical order (See Figure 6.). In other words, the system overthinks how tags are displayed. It effectively splits tags that are word phrases, unlike RefWorks, which keeps word phrases together. Phrases like “controlled vocabularies” and “recommendation engines,” used to describe Lev Grossman’s Time Magazine article on Pandora and Netflix, are reorganized and recategorized under the single terms “controlled,” “vocabularies,” “recommendation,” and “engines.” The system works best for users who tag with single terms, which is simpler in nature but more limited in scope than comparable citation management tools like RefWorks.
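The contrast in phrase handling can be modeled in a few lines. These functions are an illustrative model of the behavior described above, not CiteULike’s or RefWorks’ actual code:

```python
def refworks_style(tags):
    """Phrase-preserving: each descriptor is kept whole."""
    return list(tags)

def citeulike_style(tags):
    """Space-splitting: a phrase becomes separate single-term tags."""
    return [word for tag in tags for word in tag.split()]

tags = ["controlled vocabularies", "recommendation engines"]
print(refworks_style(tags))   # phrases stay intact
print(citeulike_style(tags))  # each word becomes its own tag
```

The splitting behavior is why single-term tags are the safer choice on a system like CiteULike, while multi-word descriptors survive intact in RefWorks.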

Subject Access System – Instagram 

Whereas RefWorks and CiteULike are citation systems for academic publications, Instagram has become the social tagging phenomenon with a panoramic view of the world. Instagram also incorporates a flat folksonomy through its collection of tags known as “hashtags.” Users take photos and upload them to the site, tagging them in order to gain visibility or popularity (an argument can be made for both) for the image. These tags, or hashtags as they are called on the site, use a pound sign (“#”) and can be written as single terms and/or word phrases. The user then tags the photo with any number of what Lawson (2009) would classify as “objective tags” that describe the content, like “#beach” when photographing the shoreline, or “subjective tags” that are not content related, like “#wishyouwerehere” for the same photo (p. 577). There are even tag generator apps, such as “TagsForLikes,” which help establish tag consistency by creating predetermined sets of hashtags based on the subject matter of the photo or a desire to increase followers on the site.
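Hashtag extraction, and Lawson’s objective/subjective split, can be sketched as follows; the caption and the content-word list here are hypothetical illustrations:

```python
import re

def extract_hashtags(caption):
    """Pull '#' tags out of a photo caption."""
    return re.findall(r"#(\w+)", caption)

# Hypothetical content-describing vocabulary used to sort tags into
# Lawson's objective (content) vs. subjective (non-content) buckets.
OBJECTIVE = {"beach", "sunset", "shoreline"}

caption = "Golden hour #beach #wishyouwerehere"
tags = extract_hashtags(caption)
objective = [t for t in tags if t in OBJECTIVE]
subjective = [t for t in tags if t not in OBJECTIVE]
print(objective, subjective)
```

In practice there is no authoritative content-word list, which is exactly what makes the hashtag collection a flat folksonomy rather than a controlled vocabulary.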

Subject Access System – Pandora 

Unlike CiteULike and Instagram, Pandora is a closed tagging system. It utilizes the Music Genome Project as its controlled vocabulary system. Within the Music Genome Project is a set of tags known as “attributes” that are assigned to each song by a Pandora music indexer. Lev Grossman (2010) of Time Magazine described the process as follows:

“Every time a new song comes out, someone on Pandora’s staff — a specially trained musician or musicologist — goes through a list of possible attributes and assigns the song a numerical rating for each one. Analyzing a song takes about 20 minutes.”

Because the system is closed, it helps retain the tag quality and the discoverability of each song. Since the system banks on getting the right song to the right user every time, there is a strong argument for avoiding user-generated tagging features that would interfere with song discovery. When a song plays, a short sample list of attributes is shown to the user, but not all attributes are displayed, since the controlled vocabulary used by Pandora is proprietary.

References

Golder, S. & Huberman, B. (2005). Usage Patterns of Collaborative Tagging Systems. Journal of Information Science. 32(2): 198-208. Retrieved from
http://pkudlib.org/qmeiCourse/files/Golder_usage_patterns_collaborative_tagging.pdf

Grossman, L. (May 27, 2010). How Computers Know What We Want – Before We do. Time Magazine. Retrieved from
http://content.time.com/time/magazine/article/0,9171,1992403,00.html

Guy, M. & Tonkin, E. (2006). Folksonomies: Tidying up tags?  D-Lib Magazine, 12(1).
Retrieved from
http://www.dlib.org/dlib/january06/guy/01guy.html

Kakali, C., & Papatheodorou, C. (2010). Exploitation of Folksonomies in Subject Analysis. Library & Information Science Research. 32: 192-202. Retrieved from
http://www.sciencedirect.com/science/article/B6W5R-4YYVCNF-1/2/08c1ec4e14294cc582df6deab01a717c

Lawson, K. G. (2009). Mining Social Tagging Data for Enhanced Subject Access for Readers and Researchers. The Journal of Academic Librarianship. 35(6): 574-582.

Sinclair, J. & Cardew-Hall, M. (2008). The Folksonomy Tag Cloud: When is it Useful? Journal of Information Science. 34(1): 15-29.

Taylor, A. G., & Joudrey, D. N. (2009). Chapter 10: Systems for Vocabulary Control. The Organization of Information. (3rd ed.). Westport, CT: Libraries Unlimited.

Yoo, D., Choi, K., Suh, Y., & Kim, G. (2013). Building and Evaluating a Collaboratively Built Structured Folksonomy. Journal of Information Science. 39 (5): 593-607.

Appendix 1.

All Descriptors_Data Sets

Quora.com & Authority Control

Quora.com is a social media website that “crowdsources” knowledge and content from its users on various topics. Entry into its database of user-contributed data is through a search box at the top of the page, where the user can browse topics using keyword and/or phrase searches to find the desired topic. From there, the user can explore the topic further through a results list ranked by relevancy, with hyperlinks peppered throughout.

What’s so interesting about Quora.com is that its authority control/controlled vocabulary is largely left to the user. You create the topic, or add a new question within an applicable topic, and from there have the ability to do some serious “authority” control by way of the “Manage” button listed below the topic title. Similar to including additional MARC fields on a record so that it may be searched by more than one name, under this section users can add “Topic Aliases” to make it easier to search using misspelled words (e.g., “Qoura” for “Quora”), close-to words, and other names that a particular topic may go by.
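The Topic Alias mechanism behaves much like a see-reference lookup. A minimal sketch with a hypothetical alias table (not Quora’s actual data or code):

```python
# Hypothetical alias table, in the spirit of MARC see-references.
ALIASES = {
    "qoura": "Quora",       # common misspelling
    "quora.com": "Quora",   # alternate name
}

def resolve_topic(query):
    """Map a user's query to the canonical topic name, falling
    back to the query itself when no alias matches."""
    q = query.strip().lower()
    return ALIASES.get(q, query.strip())

print(resolve_topic("Qoura"))  # resolves to the canonical topic
```

The difference on Quora is that any user can edit this table, which is what makes its “authority control” only as controlled as its users let it be.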

Now, the great thing about this feature is that users are able to create topics, topic aliases, and descriptions, add geolocation info, merge topics, and delete topics. And the not-so-great thing about this feature is that users are able to create topics, topic aliases, and descriptions, add geolocation info, merge topics, and delete topics. Did you see what I did there? Exactly. Many dilemmas can come up with this model, including bias, false information, and even self-promotion, as on the Social Media topic page, where someone tried to advertise within the actual definition of the phrase “Social Media,” discoverable when you select the “Manage” link (this was later removed by another “authority control”-savvy user):

https://www.quora.com/Social-Media/manage

Quora.com is only as controlled as its users let it be. It gives a good bit of democratic control to the masses and has a surprisingly objective model for controlled vocabulary, because as much as someone may want to screw with a topic page, there is another gallant user who wants to correct it. Call it “user-enabled equilibrium.” In addition, when topics are answered, the most relevant and credible answers do make their way to the top of the results list; that, again, is due to a crowdsourced voting system, which is a whole other story.

And just for a little “meta” amusement…you may need to log in to view the links, but here is the Quora topic page on the Quora.com site. Users can manipulate the aforementioned fields within the Quora topic page under the “Manage” section, or look at the innards of all that is Quora by searching its page, viewing its policies, seeing what’s trending about Quora on Quora, and even asking Quora a question about Quora on the Quora website. Pretty neat, huh?

Check out these links:

https://www.quora.com/Quora

https://www.quora.com/Quora/manage

3 Tips on How to Write a Paper That Beats Any Deadline

1. STRUCTURE YOUR PAPER

As soon as you get your school’s assignment instructions:

Open MS Word, Apple’s Pages, or Google Docs and set up your paper by listing all the required sections in your assignment as headings and subheadings. Also include a References (APA) or Works Cited (MLA) section as the last page of the document. Be sure to abide by your assignment’s required format (MLA, APA, etc.).

2. GATHER INFORMATION

As you read various course readings to include in your paper:

Start paraphrasing (recommended) or quoting what you read and add those bits into the appropriate sections (headings and subheadings) of your paper. As soon as you enter this information, go down to the References or Works Cited section and include the corresponding citation in the appropriate format (MLA, APA, etc.). Look for a pattern to emerge during this stage and for sections to flow from one to another. If they don’t, now’s the time to flesh out the paper and add whatever’s missing: an introduction or abstract, topic sentences, and a conclusion.

Now step away from your paper for a day or two.

3. REFINE YOUR CONTENT

Read your paper from beginning to end. Remove duplicate word phrases and concepts, and fix grammatical errors. Review the final layout, double-check that it follows the correct format (MLA, APA, etc.), and submit away!

Repost from my Instagram account: @wednesdaywritingtips

Day 4 of #HLSDITL 10.31.2013

Happy Halloween fellow #HLSDITL followers!

Today’s post will be brief since I’m writing this as I’m in the middle of taking my Organization of Information Class. We’re discussing authority control and controlled vocabularies, exciting stuff for library folks!

Below are some of the highlights of Day 4:

  • Processed article requests for patrons; ordered those we did not own in our collection
  • Attended our department’s quarterly meeting and discussed the changes that have taken place since the previous quarter
  • Entered articles into the copy service as a way to track library usage
  • Placed a book on hold for a patron

Other highlights:

  • Created an image to go with my HLS post
  • Scheduled to have my Organization of Information class tonight from 8-10 p.m.

Ultimate highlight to my day?

I met Simba, a golden retriever therapy dog:

image

Day 3 of #HLSDITL 10.30.2013

Feeling a lot better than yesterday, I returned to work with a cup of delicious mint tea, ready to document my day for #HLSDITL :-)

Highlights of my day:

Because we’re a small library operation, I have to manually process overdue fines. So, I ran an overdue items report and checked it against patrons who received a notice and were still past due with their library materials. Afterwards I submitted the fine deductions to be processed by the payroll department.

Sometimes requests will come in for articles found in our print collection. We have a pretty extensive set of rolling shelves that house large volumes of older medical journals. Today, I scanned quite a few old medical articles from large bound journal sets and converted them to PDF files.

I worked on my Hack Library School post during my lunch break and was finally able to submit my first draft for review by the HLS team. I received some initial feedback and am working on the revision in time to be published this Friday, November 1st.

One useful tool I used today?

Pubmed.gov

Oftentimes, when we request a journal article we do not own, it is convenient to search for its PMID (PubMed ID) in order to quickly process document delivery requests through DOCLINE. If the article has been indexed by the National Library of Medicine into the MEDLINE database, the article’s PMID can be searched for on the Pubmed.gov site.

Here’s how:
If you know the article’s citation, use the Single Citation Matcher feature on the site.

Go to the bottom of the abstract to locate the PMID.
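For scripted lookups, the same kind of citation search can be expressed against NCBI’s E-utilities ESearch endpoint. The sketch below only constructs the query URL (no request is sent); the `[ta]`, `[dp]`, and `[au]` field tags are standard PubMed search tags, while the sample journal/year/author values are hypothetical:

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for querying PubMed.
BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pmid_search_url(journal, year, author):
    """Build an ESearch URL that looks up PMIDs by journal title
    abbreviation, publication date, and author."""
    term = f"{journal}[ta] AND {year}[dp] AND {author}[au]"
    return BASE + "?" + urlencode({"db": "pubmed", "term": term})

url = pmid_search_url("JAMA", "2013", "Smith")
print(url)
```

Fetching that URL returns an XML list of matching PMIDs, which can then be dropped straight into a DOCLINE request.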