L697 - Advanced Topics in Information Systems:
Formal and Relational Concept Analysis
School of Library and Information Science
Indiana University
Summer I 1997
Instructor: Uta Priss
Email: upriss@indiana.edu
Office: 022 SLIS
Office phone: 812-855-2793
Office hours:
Introduction
Formal Concept Analysis is a fast-growing, relatively new field. It was
introduced by Rudolf Wille in 1982. Since then more than 250 papers on the
subject have been published, including several textbooks and conference
proceedings. It provides a method
of formal data analysis which has successfully been applied to many
fields, such as medicine and psychology, musicology, linguistic databases,
library and information science, software re-engineering, civil engineering,
ecology, and others.
(An extensive bibliography of Formal Concept Analysis can be found here.)
A main advantage of Formal Concept Analysis as a tool for formal
data analysis is its capability of producing graphical visualizations
of the inherent structures among data. Especially for social scientists,
who often handle data sets that cannot fully be captured in quantitative
analyses, Formal Concept Analysis extends the scientific toolbox of formal
analysis methods. Statistics and Concept Analysis complement each other
in this sense.
In the field of information science there is even a further application:
the mathematical lattices that are used in Formal Concept Analysis
are orderings and can therefore be interpreted as classification systems.
This leads to a new understanding of the structure of classification
systems which can be controlled by the formal representation.
This interpretation of classification structures is compatible with
theories among library scientists (such as Bliss and Shera).
Furthermore, Ranganathan's `facets' are represented as `scales'
in the framework of Formal Concept Analysis.
Formalized classification systems can be analysed according to the
consistency of their relations. Thesauri can automatically be constructed
from classes and their attributes, without having to create a hierarchy
of classes by hand. As an example, an on-line library catalog using the
Conceptual Diagrams of an automatically constructed class hierarchy has been
implemented in the ZIT library in Darmstadt.
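As a rough illustration of how a hierarchy can follow from classes and their attributes alone, here is a minimal sketch. The class names, the attribute names, and the helper `broader_terms` are invented for this example and are not taken from the ZIT catalog or from any of the course software. It treats one class as broader than another when its attribute set is a proper subset of the other's.

```python
# Hypothetical toy data: each class is described by a set of attributes.
classes = {
    "document": {"has_author"},
    "book": {"has_author", "has_isbn"},
    "journal_article": {"has_author", "has_volume"},
    "textbook": {"has_author", "has_isbn", "has_exercises"},
}

def broader_terms(classes):
    """Map each class to the classes whose attribute sets it strictly contains."""
    return {
        name: sorted(
            other
            for other, other_attrs in classes.items()
            if other != name and other_attrs < attrs  # proper-subset test
        )
        for name, attrs in classes.items()
    }

hierarchy = broader_terms(classes)
for name, broader in hierarchy.items():
    print(name, "->", broader)
```

No hierarchy is entered by hand here: adding or removing an attribute of a class changes its position in the ordering automatically.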
Course Objectives
- To introduce a formal method of qualitative data analysis.
- To provide practical experience with basic data analysis
techniques, such as selection, grouping and scaling of features.
- To develop the student's ability to understand the
problems involved in the formalization of `informal' data.
- To teach practical skills in using the computer software DIAGRAM,
ANACONDA, and TOSCANA.
- To provide practical experience with
techniques of structuring graphical representations.
- To provide insights into the formal structure of classification systems.
Class Organization
The course consists of lectures by the instructor and
class discussions. About 50% of the class time
will be spent in practical training (lab sessions and computer lab sessions).
During the first session students will form teams of two to three
members. Each team will select a topic from a field of interest
(this can be data they used for research in another class or research project)
that appears suitable for Formal Concept Analysis.
The instructor will provide some topics for groups who do not find
an appropriate topic. The students will develop a formal analysis of their
data using the techniques they learn during the semester. In the 11th
session the groups will present their results to the class.
Readings
Since the existing textbooks on Formal Concept Analysis are either written
for mathematicians or written in German, there will be no textbook for the
class. All readings for the class will be put on reserve in the SLIS
library. The students are required to make their own photocopies of
the two introductory papers (Wille (1996): `Introduction to FCA' and
Wolff (1994): `A first course in FCA') and bring them to class, since these
two papers will be used as the main material for several sessions.
The readings should be read before the session to which they are assigned
according to the class schedule.
Grading
The final course grade will be computed for each student on the basis
of grades assigned for the following:
Class contribution | 1/3
Group Project | 1/3
Final Exam | 1/3
Each student is expected to complete all course work by the end of the term.
A grade of incomplete (I) will be assigned only if exceptional
circumstances warrant.
Class contribution
Class contribution does not mean attendance, but the quality and quantity
of contributions to the work of the class. Comments and questions are
equally valuable if they help to clarify the topics
and to move the discussion forward. The assignments and readings
of each week must be completed before the class meeting so that
substantive and meaningful contributions from the students are possible.
It is required that every student demonstrate respect for the ideas,
opinions, and feelings of all other members of the class.
Group presentation and project
During the first session students will form teams of two to three
members. Each team will select a topic from a field of interest
that appears suitable for Formal Concept Analysis.
The students will develop a formal analysis of their
data using the techniques they learn during the semester. In the 11th
session the groups will present their results to the class.
The class presentation will use several representation techniques
learned during the semester and contain an interpretation of the results.
The presentation will last for 20 to 30 minutes.
Groups are encouraged to
consult the instructor several times throughout the semester
to clarify questions and to discuss ways of solving problems.
Final Exam
The final exam will be a take-home exam consisting of two (small) data sets to
be modeled with Formal Concept Analysis and two essay questions.
It will be distributed during the 9th session and is due at the
beginning of the 12th session. Teamwork is not permitted on the final
exam.
Academic Dishonesty
Any assignment that contains plagiarized material or indicates any other
form of dishonesty will receive, at a minimum, an automatic grade of F.
A second instance will result in an automatic grade of F for the course.
Class Schedule
Session: 1. Formal Modeling of Data
How can informal data be formally investigated?
Data selection, coding and representation
Qualitative versus quantitative data analysis
Assignment:
Create a list of problems involved in formal data analysis methods. Assign
the problems to the different stages of data analysis (such as data
selection, coding and representation). Try to evaluate the severity of
each problem.
Readings:
- Wolff, Karl Erich (1995). Comparison of Graphical Data Analysis Methods.
Proceedings SoftStat'95.
(Hint: It is not necessary to understand the mathematical details
of this paper. Study the "main problems" mentioned for the data
analysis methods.)
- Schoenemann, Peter H. (1994). Measurement: The Reasonable Ineffectiveness
of Mathematics in the Social Sciences. In: Borg; Mohler (eds.). Trends and
Perspectives in Empirical Social Research. De Gruyter. pp. 149 - 159.
- de Leeuw, Jan (1994). Statistics and the Sciences. In: Borg; Mohler (eds.).
Trends and Perspectives in Empirical Social Research. De Gruyter.
pp. 138 - 148.
Optional Readings:
- Gigerenzer, Gerd; Murray, David J. (1987). Emergence of Statistical
Inference. In: Cognition as Intuitive Statistics.
Lawrence Erlbaum, Hillsdale, New Jersey. pp. 1 - 28.
- Paulos, John Allen (1988). Innumeracy: mathematical illiteracy and its
consequences. New York, Hill and Wang.
(Hint: This book is not directly concerned with data analysis,
but with basic mathematical operations, such as counting and estimating.
The mistakes which Paulos describes are more likely to be found in
newspaper statistics than in scientific research.)
Session: 2. Content Analysis - Linguistic and Computational Methods
Content analysis
Linguistic methods for text (discourse) analysis
The software GABEK
Readings:
- Harris, Mary Dee (1985). Introduction to Natural Language Processing.
pp. 55 - 69 and 98 - 104
- Zelger, Josef (1993). A Dialogic Networking Approach to Information
Retrieval. Preprint 24, University of Innsbruck.
- Zelger, Josef (1996). From Verbal Data to Practical Knowledge. In: Gaul;
Pfeifer. From Data to Knowledge. Springer. pp. 458 - 465.
- Groeben, Norbert; Rustemeyer, Ruth (1994). On the Integration of
Quantitative and Qualitative Methodological Paradigms (Based on the
Example of Content Analysis). In: Borg; Mohler (eds.). Trends and
Perspectives in Empirical Social Research. De Gruyter. pp. 308 - 325.
Session (Lab): 3. Formal Concepts and Concept Lattices
Designing formal contexts
Extracting concepts from formal contexts
Readings:
- Wille, Rudolf (1996). Introduction to Formal Concept Analysis.
Preprint, TH-Darmstadt. pp. 1 - 4
- Wolff, Karl Erich (1994). A first Course in Formal Concept Analysis.
Proceedings SoftStat'93. Gustav Fischer Verlag. pp. 1 - 5
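To make the lab topic concrete, the following is a minimal sketch of extracting formal concepts from a small formal context. The context (three animals and three attributes) and all function names are hypothetical and chosen only for illustration. The approach is brute force over all object subsets, which is only practical for very small contexts; it is not the algorithm used by the course software.

```python
from itertools import combinations

# Hypothetical formal context: each object mapped to the attributes it has.
context = {
    "frog": {"lives_in_water", "lives_on_land"},
    "fish": {"lives_in_water"},
    "dog": {"lives_on_land", "has_fur"},
}
all_attributes = set().union(*context.values())

def common_attributes(objects):
    """Attributes shared by every given object (all attributes if none given)."""
    result = set(all_attributes)
    for obj in objects:
        result &= context[obj]
    return result

def objects_having(attributes):
    """Objects that have every attribute in the given set."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}

def formal_concepts():
    """Collect all (extent, intent) pairs by closing every subset of objects."""
    found = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found

concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each pair consists of an extent (a set of objects) and an intent (the attributes they share), where the extent is exactly the set of objects having all attributes of the intent.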
Session (Lab): 4. Line Diagrams of Concept Lattices
How to draw `nice' line diagrams
Readings:
- Wille, Rudolf (1996). Introduction to Formal Concept Analysis.
Preprint, TH-Darmstadt. pp. 4 - 5
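A line diagram draws only the immediate neighbor relation between concepts; all other order relations can be read off along ascending paths. The sketch below computes these edges for a hand-written list of concept extents ordered by set inclusion. The data and the function name `covering_pairs` are assumptions made for illustration, and the sketch says nothing about how to place the nodes so that the diagram looks `nice'.

```python
# Hypothetical concept extents, ordered by set inclusion.
extents = [
    frozenset(),
    frozenset({"frog"}),
    frozenset({"dog"}),
    frozenset({"frog", "fish"}),
    frozenset({"frog", "dog"}),
    frozenset({"frog", "fish", "dog"}),
]

def covering_pairs(extents):
    """Return (lower, upper) pairs with lower < upper and no extent strictly between."""
    edges = []
    for lower in extents:
        for upper in extents:
            if lower < upper and not any(lower < mid < upper for mid in extents):
                edges.append((lower, upper))
    return edges

edges = covering_pairs(extents)
for lower, upper in edges:
    print(sorted(lower), "--", sorted(upper))
```

Each printed pair corresponds to one line segment in the diagram, drawn from the lower concept up to the upper one.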
Session (Computerlab): 5. ANACONDA and DIAGRAM
The software ANACONDA and DIAGRAM
Drawing lattices with a computer
Readings:
- Luksch; Skorsky; Wille (1985). On Drawing Concept Lattices with a Computer.
In: Gaul; Schader (eds.). Classification as a Tool of Research.
North-Holland. pp. 269 - 274.
Session: 6. Facet Theory
Facet theory
Faceted classification systems
Facets as scales in Formal Concept Analysis
Readings:
- Borg, Ingwer (1994). Evolving Notions of Facet Theory. In: Borg; Mohler
(eds.). Trends and Perspectives in Empirical Social Research. De Gruyter.
pp. 178 - 200.
- Vickery, Brian C. (1972/1966). Faceted classification schemes. In:
A. F. Painter (ed.). Reader in classification and descriptive cataloguing.
NCR Microcard Editions. pp. 107 - 114.
Session (Lab): 7. Conceptual Scaling
Conceptual Scaling of many-valued contexts
Readings:
- Wolff, Karl Erich (1994). A first Course in Formal Concept Analysis.
Proceedings SoftStat'93. Gustav Fischer Verlag. pp. 5 - 9
- Velleman, Paul; Wilkinson, Leland (1994). Nominal, Ordinal, Interval, and
Ratio Typologies are Misleading. In: Borg; Mohler (eds.). Trends and
Perspectives in Empirical Social Research. De Gruyter. pp. 161 - 177.
Optional Reading:
- Ganter, Bernhard; Wille, Rudolf (1989). Conceptual Scaling. In: Roberts
(ed.). Applications of combinatorics and graph theory to the biological
and social sciences. Springer, Heidelberg.
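As a small illustration of conceptual scaling, the sketch below turns a many-valued context into a one-valued context by applying one scale per many-valued attribute. The data, the thresholds 18 and 65, and all function names are hypothetical. The age attribute is given an ordinal scale and the status attribute a nominal scale; other choices of scale would yield other derived contexts.

```python
# Hypothetical many-valued context: each object has one value per attribute.
many_valued = {
    "Ann": {"age": 17, "status": "student"},
    "Bob": {"age": 34, "status": "staff"},
    "Cleo": {"age": 70, "status": "staff"},
}

def ordinal_age_scale(age):
    """Ordinal scale: an age satisfies every threshold at or below it."""
    return {f"age>={t}" for t in (18, 65) if age >= t}

def nominal_status_scale(status):
    """Nominal scale: each value becomes its own one-valued attribute."""
    return {f"status={status}"}

scales = {"age": ordinal_age_scale, "status": nominal_status_scale}

def apply_scales(many_valued, scales):
    """Derive a one-valued context by scaling every many-valued attribute."""
    derived = {}
    for obj, values in many_valued.items():
        derived[obj] = set()
        for attribute, value in values.items():
            derived[obj] |= scales[attribute](value)
    return derived

derived_context = apply_scales(many_valued, scales)
for obj, attrs in derived_context.items():
    print(obj, sorted(attrs))
```

The derived context can then be analysed like any ordinary formal context; the choice of scale is an interpretive decision made by the analyst, not something the data dictates.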
Session (Computerlab): 8. Nested Line diagrams and TOSCANA
Nested Line diagrams
Citation order
The software TOSCANA
Readings:
- Wille, Rudolf (1996). Introduction to Formal Concept Analysis.
Preprint, TH-Darmstadt. pp. 6 - 7
- Scheich; Skorsky; Vogt; Wachter; Wille (1993). Conceptual Data Systems.
In: Opitz; Lausen; Klar (eds.). Information and Classification.
Springer, Berlin-Heidelberg-New York.
- Vogt, Frank; Wille, Rudolf (1995). TOSCANA - A Graphical Tool for Analyzing
and Exploring Data. In: Tamassia; Tollis (eds.). Graph Drawing.
Springer, Heidelberg.
- Skorsky, Martin. TOSCANA Management System for Conceptual Data.
Available at:
http://www.mathematik.th-darmstadt.de/ags/ag1/software/ToscanaDemo/ToscanaDemo.html
Session: 9. Applications of Formal Concept Analysis to Library Classification Systems and the Internet
Large scale applications of Formal Concept Analysis
The WAVE and the GRIN project
The ZIT library catalog
Readings:
- Kent, Robert; Neuss, Christian (1995). Creating a Web Analysis and
Visualization Environment.
http://wave.eecs.wsu.edu/WAVE/references.html
- Priss, Uta (1997). A Graphical Interface for Document Retrieval Based on
Formal Concept Analysis. Proc. of the Midwest Artificial Intelligence and
Cognitive Science Conference, May 1997. (to appear)
Session: 10. Relational Concept Analysis
Quantifiers in semantic relations
Graphical representations of relations
Readings:
- Priss, Uta (1996). Relational Concept Analysis: Semantic Structures in
Dictionaries and Lexical Databases. Dissertation.
pp. 42 - 47 and 51 - 57
Session: 11. Applications: Student Presentations
Session: 12. Conclusions
Conceptual knowledge systems
Attribute exploration
Knowledge Processing
Readings:
- Wille, Rudolf (1992). Concept Lattices and Conceptual Knowledge Systems.
Computers Math. Applic. Vol. 23. pp. 507 - 514.
- Stumme, Gerd (1995). Exploration Tools in Formal Concept Analysis.
Preprint 1796, TH-Darmstadt.
- Wille, Rudolf ( ). Conceptual Landscapes of Knowledge: A Pragmatic Paradigm
of Knowledge Processing.
Uta Priss
Fri Feb 21 08:54:31 EST 1997