Metadata




Metadata is used to facilitate the understanding, use and management of data. The metadata required for effective data management varies with the type of data and context of use. In a library, where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location. In the context of a camera, where the data is the photographic image, metadata would typically include the date the photograph was taken and details of the camera settings. In the context of an information system, where the data is the content of the computer files, metadata about an individual data item would typically include the name of the field and its length. Metadata about a collection of data items, such as a computer file, might typically include the name of the file, the type of file and the name of the data administrator.


WHAT IS METADATA?

Any item of data is a description of something. Metadata is a type of data where the something being described is data. Or, as it is often put, metadata is data about data.
If we consider a particular place in the real world, this may be described by many items of data, for example:
  • 1 “E83BJ”

  • 2 “17”

  • 3 “Sunny”


To make sense of and use this data, it is necessary to have access to some form of description of the sort of data it is, or, in other words, have access to its metadata. So, for example, the metadata for the above three items of data might include:
  • 1.1 “Post Code” – This is a description of the data item “E83BJ”

  • 1.2 “The unique identifier of a postal district” – This is another description of “E83BJ”

  • 1.3 “27th June 2006” – This is another description of “E83BJ”

  • 2 “Average temperature in degrees Celsius” – This is a description of “17”

  • 3 “Yesterday’s weather” – This is a description of “Sunny”


An item of metadata is itself data and therefore may have its own metadata. This might (not particularly usefully) be referred to as meta-metadata. So, for example, “Post Code” might have the following metadata:
  • 1.1.1 “data item name”

  • 1.1.2 “8 characters, starting with A – Z”


“27th June 2006” might have the following metadata:
  • 1.3.1 “Date last changed”


The hierarchy of data, metadata, meta-metadata and so on can go on forever. Fortunately, we usually have sufficient background knowledge to make sense of and use an item of data with access to very little, if any, formally defined metadata. So, for example, given the “Post Code” metadata “8 characters, starting with A – Z”, background knowledge is enough to recognise this as a description of the format of a Post Code, without access to any defined metadata for “8 characters, starting with A – Z”.
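
The data, metadata and meta-metadata of the post code example can be sketched as a small nested structure. The following Python fragment is only an illustration: the `Item` class and the helper names are hypothetical, and the format check assumes that “8 characters, starting with A – Z” means at most 8 characters, the first of which is an upper-case letter.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A value together with the items that describe it (its metadata)."""
    value: str
    described_by: list["Item"] = field(default_factory=list)


# Meta-metadata: descriptions of the metadata item "Post Code".
post_code = Item("Post Code", [
    Item("data item name"),
    Item("8 characters, starting with A - Z"),
])

# Metadata: descriptions of the data item "E83BJ".
e83bj = Item("E83BJ", [
    post_code,
    Item("The unique identifier of a postal district"),
    Item("27th June 2006", [Item("Date last changed")]),
])


def depth(item: Item) -> int:
    """Number of levels in the hierarchy rooted at this item (data alone = 1)."""
    return 1 + max((depth(m) for m in item.described_by), default=0)


def matches_post_code_format(value: str) -> bool:
    """Assumed reading of the format: at most 8 characters, first one A-Z."""
    return 0 < len(value) <= 8 and "A" <= value[0] <= "Z"
```

Here `depth(e83bj)` counts three levels (data, metadata, meta-metadata), and nothing prevents further `Item` objects from being attached to extend the hierarchy.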


LEVELS

As indicated, there are hierarchies of data and metadata. However, any particular item of data may be on different levels of a hierarchy depending on the context. For example, when considering the geography of London, “E83BJ” would be data and “Post Code” would be metadata. But, when considering the data management of an automated system that manages geographical data, “Post Code” might be data and then “data item name” and “8 characters, starting with A – Z” would be metadata.

In any particular context, metadata must be at a higher level of abstraction than the data it is describing. So, in relation to “E83BJ”, the item of data “is in London” is a further description of the place in the real world which has the post code “E83BJ” and is at the same level of abstraction. Therefore, although it provides information about “E83BJ” (it tells us that this is the post code of a place in London), it would not normally be considered metadata, as it describes “E83BJ” qua place in the real world and not qua data.


DEFINITIONS

The term was introduced intuitively, without a formal definition. Because of that, today there are various definitions. The most common one is the literal translation:
  • Metadata is data about data.

  • Example: "12345" is data, and with no additional context is meaningless. When "12345" is given a meaningful name (metadata) of "ZIP Code", one can understand (at least in the United States, and further placing "ZIP Code" within the context of a postal address) that "12345" refers to the General Electric plant in Schenectady, New York.


Because for most people the difference between data and information is merely a philosophical one of no relevance in practical use, other definitions are:
  • Metadata is information about data.

  • Metadata is information about information.


There are more sophisticated definitions, such as:
  • "Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities." (American Library Association, Task Force on Metadata Summary Report, June 1999)

  • "[Metadata] is a set of optional structured descriptions that are publicly available to explicitly assist in locating objects." (D. C. A. Bulterman, Is It Time For a Moratorium on Metadata?, IEEE MultiMedia, Oct-Dec 2004)

These are used more rarely because they tend to concentrate on one purpose of metadata, namely finding "objects", "entities" or "resources", and ignore others, such as using metadata to optimize compression algorithms or to perform additional computations using the data.


The metadata concept has been extended into the world of systems to include any "data about data": the names of tables, columns, programs, and the like. Different views of this "system metadata" are detailed below, but beyond that is the recognition that metadata can describe all aspects of systems: data, activities, people and organizations involved, locations of data and processes, access methods, limitations, timing and events, as well as motivation and rules.

Fundamentally, then, metadata is "the data that describe the structure and workings of an organization's use of information, and which describe the systems it uses to manage that information". To do a model of metadata is to do an "Enterprise Model" of the information technology industry itself. (William R. Durrell, Data Administration: A Practical Guide to Data Administration, McGraw-Hill, 1985)


Hierarchies of metadata

When structured into a hierarchical arrangement, metadata is more properly called an ontology or schema. Both terms describe "what exists" for some purpose or to enable some action. For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects "exist" in the library's own ontology and how more specialized topics are related to or derived from the more general subject headings.

Metadata is frequently stored in a central location and used to help organizations standardize their data. This information is typically stored in a metadata registry.


Difference between data and metadata

Usually it is not possible to distinguish between (raw) data and metadata because:
  • Something can be data and metadata at the same time. The headline of an article is both its title (metadata) and part of its text (data).

  • Data and metadata can change their roles. A poem, as such, would be regarded as data, but if there were a song that used it as lyrics, the whole poem could be attached to an audio file of the song as metadata. Thus, the labeling depends on the point of view.

  • It is possible to create meta-meta-...-metadata. Since, according to the common definition, metadata itself is data, it is possible to create metadata about metadata, metadata about metadata about metadata and so on. Though at first this may seem useless, it can be essential to archive metadata about metadata, for example to keep track of where the metadata came from when merging two documents.

These considerations apply no matter which of the above definitions is considered.



USE

Metadata has many different applications; this section lists some of the most common.

Metadata is used to speed up and enrich searching for resources. In general, search queries using metadata can save users from performing more complex filter operations manually. It is now common for web browsers (with the notable exception of Mozilla Firefox), P2P applications and media management software to automatically download and locally cache metadata, to improve the speed at which files can be accessed and searched.
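
As a rough illustration of searching by metadata rather than by content, the sketch below filters a small in-memory catalogue. The file names, field names and the `search_by_metadata` helper are all hypothetical.

```python
# A locally cached catalogue: one metadata record per file (invented data).
catalogue = [
    {"file": "beach.jpg", "type": "image", "year": 2005},
    {"file": "harbour.jpg", "type": "image", "year": 2006},
    {"file": "report.doc", "type": "document", "year": 2006},
]


def search_by_metadata(records, **criteria):
    """Return the names of files whose metadata matches every criterion."""
    return [
        record["file"]
        for record in records
        if all(record.get(key) == wanted for key, wanted in criteria.items())
    ]


images_from_2006 = search_by_metadata(catalogue, type="image", year=2006)
```

The query never opens the files themselves; it only inspects the cached records, which is where the speed-up comes from.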

Metadata may also be associated with files manually. This is often the case with documents which are scanned into a document storage repository such as FileNet or Documentum. Once the documents have been converted into an electronic format, a user brings the image up in a viewer application, reads the document and keys values into an online application to be stored in a metadata repository.

Metadata provides additional information to users of the data it describes. This information may be descriptive ("These pictures were taken by children in the school's third grade class.") or algorithmic ("Checksum=139F").
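
The difference between descriptive and algorithmic metadata can be sketched as follows. The checksum used here is a simple 16-bit sum of the bytes, chosen only for illustration; it is not claimed to be the algorithm behind the "139F" value above, and the function and field names are hypothetical.

```python
def checksum16(payload: bytes) -> int:
    """Illustrative checksum: the sum of all bytes, kept to 16 bits."""
    return sum(payload) & 0xFFFF


def attach_metadata(payload: bytes, description: str) -> dict:
    """Bundle data with one descriptive and one algorithmic metadata item."""
    return {
        "payload": payload,
        "description": description,                       # descriptive
        "checksum": format(checksum16(payload), "04X"),   # algorithmic
    }


def is_intact(record: dict) -> bool:
    """Recompute the checksum and compare it with the stored metadata."""
    return format(checksum16(record["payload"]), "04X") == record["checksum"]


record = attach_metadata(b"metadata", "Pictures taken by the third grade class")
```

The description is meant for a human reader, whereas the checksum is meant to be recomputed by a program, as `is_intact` does.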

Metadata helps to bridge the semantic gap. By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process even more complex filter and search operations. For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page. This approach, called knowledge representation, is of special interest to the Semantic Web and artificial intelligence.
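
A toy version of the Van Gogh example might look like this. The facts, page texts and `example.org` URLs are invented for illustration, and real knowledge representation systems are considerably more elaborate.

```python
# Relations between data items, stored as (subject, relation, object) triples.
FACTS = [
    ("Van Gogh", "member of", "Dutch painters"),
    ("Monet", "member of", "French painters"),
]

# Page texts; note that neither contains the words "Dutch painters".
PAGES = {
    "https://example.org/van-gogh": "Vincent Van Gogh painted The Starry Night.",
    "https://example.org/monet": "Claude Monet painted Water Lilies.",
}


def semantic_search(category: str) -> list[str]:
    """Find pages that mention any known member of the given category."""
    members = {s for s, relation, o in FACTS
               if relation == "member of" and o == category}
    return sorted(url for url, text in PAGES.items()
                  if any(name in text for name in members))


results = semantic_search("Dutch painters")
```

A plain text search for "Dutch painters" would find nothing in `PAGES`; the stored relation is what connects the query to the Van Gogh page.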

Certain metadata is designed to optimize lossy compression algorithms. For example, if a video has metadata that allows a computer to tell foreground from background, the latter can be compressed more aggressively to achieve a higher compression rate.
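
One simple way to picture this is quantization with a step size that depends on a foreground mask. The sketch below is an assumption-laden illustration, not a description of any real codec: pixel values are plain integers from 0 to 255, the mask is the metadata, and the step sizes are arbitrary.

```python
def quantize(value: int, step: int) -> int:
    """Round a pixel value down to the nearest multiple of step (lossy)."""
    return (value // step) * step


def compress(pixels, foreground_mask, fg_step=4, bg_step=32):
    """Quantize background pixels more coarsely than foreground pixels."""
    return [
        quantize(pixel, fg_step if is_foreground else bg_step)
        for pixel, is_foreground in zip(pixels, foreground_mask)
    ]


pixels = [10, 13, 200, 203, 90, 95]
mask = [False, False, True, True, False, False]  # metadata: True = foreground
compressed = compress(pixels, mask)
```

Coarser quantization leaves fewer distinct values in the background, which a later encoding stage could store in fewer bits, while the foreground keeps most of its detail.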

Some metadata is intended to enable variable content presentation. For example, if a picture has metadata that indicates the most important region (the one where there is a person), an image viewer on a small screen, such as a mobile phone's, can narrow the picture to that region and thus show the user the most interesting details. A similar kind of metadata is intended to allow blind people to access diagrams and pictures, by converting them for special output devices or reading their description using text-to-speech software.
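
The cropping case can be sketched with a tiny character "image". The `important_region` field and its (left, top, right, bottom) layout are assumptions made for this example only.

```python
# A 6x4 picture; "P" marks the person, "." is everything else.
picture = {
    "pixels": [
        "......",
        "..PP..",
        "..PP..",
        "......",
    ],
    # Metadata: the most important region as (left, top, right, bottom),
    # with right and bottom exclusive.
    "important_region": (2, 1, 4, 3),
}


def crop_to_region(image: dict) -> list[str]:
    """Return only the rows and columns inside the important region."""
    left, top, right, bottom = image["important_region"]
    return [row[left:right] for row in image["pixels"][top:bottom]]


small_screen_view = crop_to_region(picture)
```

The viewer does not need to analyse the picture itself; it relies entirely on the region supplied as metadata.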

Other descriptive metadata can be used to automate workflows. For example, if a "smart" software tool knows the content and structure of data, it can convert it automatically and pass it to another "smart" tool as input. As a result, users save the many copy-and-paste operations required when analyzing data with "dumb" tools.

Metadata is becoming increasingly important in litigation, where the metadata of documents and files can be important evidence. Recent changes to the Federal Rules of Civil Procedure make metadata routinely discoverable as part of civil litigation. Parties to litigation are required to maintain and produce metadata as part of discovery, and spoliation of metadata can lead to sanctions.

Metadata has become important on the World Wide Web because of the need to find useful information from the mass of information available. Manually created metadata adds value because it ensures consistency. If a web page about a certain topic contains a word or phrase, then all web pages about that topic should contain that same word or phrase. Metadata also ensures variety, so that if a topic goes by two names each will be used. For example, an article about "sport utility vehicles" would also be tagged "4 wheel drives", "4WDs" and "four wheel drives", as this is how SUVs are known in some countries.
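
Consistency and variety of tags can both be obtained from a shared vocabulary that lists every name a topic goes by. The sketch below assumes such a vocabulary exists; `VOCABULARY` and `expand_tags` are hypothetical names.

```python
# Each set lists all the names under which one topic is known.
VOCABULARY = [
    {"sport utility vehicles", "4 wheel drives", "4WDs", "four wheel drives"},
]


def expand_tags(tags: set[str]) -> set[str]:
    """Add every alternative name for any topic already present in tags."""
    expanded = set(tags)
    for names in VOCABULARY:
        if expanded & names:   # the article is about this topic
            expanded |= names  # so tag it with all of the topic's names
    return expanded


article_tags = expand_tags({"sport utility vehicles"})
```

Whichever name an author starts from, every article on the topic ends up with the same full set of tags, so a search on any one of the names finds them all.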

Sources of metadata for audio CDs include the MusicBrainz project and AMG's All Music Guide. Similarly, MP3 files have metadata tags in a format called ID3.
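
ID3 exists in several versions. As a minimal sketch, the function below reads only the oldest and simplest one, ID3v1, which stores a fixed 128-byte block beginning with the marker "TAG" at the very end of the file. The sample bytes are fabricated for the example and are not real audio; the helper names are hypothetical.

```python
def read_id3v1(data: bytes):
    """Return the ID3v1 fields found in the last 128 bytes, or None."""
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with zero bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }


def pad(value: bytes, width: int) -> bytes:
    """Pad a field to its fixed width with zero bytes."""
    return value.ljust(width, b"\x00")


# Placeholder "audio" bytes followed by a hand-built 128-byte ID3v1 block.
sample = (
    b"\x00" * 16
    + b"TAG"
    + pad(b"Example Song", 30)
    + pad(b"Example Artist", 30)
    + pad(b"Example Album", 30)
    + b"2006"
    + pad(b"", 30)   # comment field
    + b"\xff"        # genre byte
)

tags = read_id3v1(sample)
```

Later ID3 versions use a different, more flexible layout at the start of the file and would need a separate parser.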


TYPES OF METADATA

Metadata can be classified by:
  • Content. Metadata can either describe the resource itself (for example, name and size of a file) or the content of the resource (for example, "This video shows a boy playing football").

  • Mutability. With respect to the whole resource, metadata can be either immutable (for example, the "Title" of a video does not change as the video itself is being played) or mutable (the "Scene description" does change).

  • Logical function. There are three layers of logical function: at the bottom the subsymbolic layer that contains the raw data itself, then the symbolic layer with metadata describing the raw data, and on the top the logical layer containing metadata that allows logical reasoning using the symbolic layer.



IMPORTANT ISSUES

To successfully develop and use metadata, several important issues should be treated with care:


Metadata risks