Building Trustworthy Semantic Webs
Recent developments in information systems technologies have resulted in computerizing many applications in various
business areas. Data has become a critical resource in many organizations, and therefore efficient access to data, sharing
the data, extracting information from the data, and making use of the information have become urgent needs. As a result,
there have been many efforts not only to integrate the various data sources scattered across several sites, but also to
extract information from these databases in the form of patterns and trends. These data
sources may be databases managed by database management systems, or they could be data warehoused in a repository from
multiple data sources.
The advent of the World Wide Web (WWW) in the mid-1990s resulted in even greater demand for managing data, information and
knowledge effectively. There is now so much data on the Web that managing it with conventional tools is becoming almost
impossible. New tools and techniques are needed to effectively manage this data. Therefore, to provide interoperability as
well as to ensure machine-understandable Web pages, the concept of the semantic Web was conceived by Tim Berners-Lee, who heads
the World Wide Web Consortium (W3C).
As the demand for data and information management increases, there is also a critical need for maintaining the security of
the databases, applications and information systems. Data and information have to be protected from unauthorized access as
well as from malicious corruption. With the advent of the Web, it is even more important to protect the data and
information as numerous individuals now have access to this data and information. Therefore, we need effective mechanisms
to secure the semantic Web technologies.
Building Trustworthy Semantic Webs reviews the developments in semantic Web technologies and describes ways of securing
these technologies. The focus is on confidentiality, privacy, trust, and integrity management for the semantic Web. We call
such a semantic Web a trustworthy semantic Web. We discuss applications of trustworthy semantic Webs in secure Web
services, secure interoperability, secure knowledge management, secure e-business and secure information sharing.
Developments and Directions for Trustworthy Semantic Webs
As stated by Berners-Lee, the semantic Web consists of a collection of technologies that enable machine-understandable
Web pages. The idea is for agents acting on behalf of users to collaborate with one another, invoke Web services,
understand Web pages and carry out activities such as making airline reservations, planning for a surgery or designing a
vehicle. The technologies that make up the semantic Web include markup languages such as Extensible Markup
Language (XML), semantics-based languages such as Resource Description Framework (RDF) and ontology languages such as
Web Ontology Language (OWL). Agents use these technologies, negotiate contracts with each other and carry out
activities. In order to ensure the security of operation, the semantic Web needs to enforce policies for confidentiality,
privacy, trust, and integrity among others. That is, policies specify the types of access that agents have to the Web
resources and also the extent to which the agents trust one another. In order to carry out negotiation, various inferencing
systems have been developed. While numerous developments have been reported on semantic Web technologies, security has
only recently begun to receive attention. Therefore, one of the major directions for the semantic Web is to ensure
the security of operation. We discuss some of the security issues in the next few paragraphs.
Consider the XML layer of the semantic Web. One needs secure XML. That is, access to various portions of
the document must be controlled for reading, browsing and modification. There is research on securing XML and XML schemas. The next step is
securing RDF. With RDF, we need not only secure XML but also security for the interpretations and semantics. For
example, under certain contexts, portions of a document may be Unclassified while under certain other contexts the document
may be Classified. As an example, one could declassify an RDF document once the war is over.
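The idea of controlling access to portions of an XML document, with a classification that depends on context, can be sketched as follows. This is only an illustration under stated assumptions: the level and peacetime-level attributes, the two-level ordering and the function names are hypothetical and are not part of any XML security standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical ordering of security levels, lowest first.
LEVELS = {"Unclassified": 0, "Secret": 1}

# Hypothetical document: each portion carries its own level, and one portion
# is declassified once the context changes to peacetime.
DOC = """<report>
  <summary level="Unclassified">Supply convoy arrived.</summary>
  <route level="Secret" peacetime-level="Unclassified">Via northern pass.</route>
</report>"""


def effective_level(elem, context):
    """Return the level of an element under the given context (assumed rule)."""
    if context == "peacetime" and "peacetime-level" in elem.attrib:
        return elem.get("peacetime-level")
    return elem.get("level", "Unclassified")


def readable_view(xml_text, clearance, context):
    """Parse the document and drop every element above the reader's clearance."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            if LEVELS[effective_level(child, context)] > LEVELS[clearance]:
                parent.remove(child)
    return root


wartime = readable_view(DOC, "Unclassified", "wartime")
peacetime = readable_view(DOC, "Unclassified", "peacetime")
print([e.tag for e in wartime])    # ['summary']
print([e.tag for e in peacetime])  # ['summary', 'route']
```

The same reader sees a different view of the same document depending on the context, which is the behavior the declassification example describes.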
Once XML and RDF have been secured, the next step is to examine security for ontologies. That is, ontologies may have
security levels attached to them. Certain parts of the ontologies could be Secret while certain other parts may be
Unclassified. The challenge is, how does one use these ontologies for applications such as secure information integration?
Researchers have done some work on the secure interoperability of databases and the use of ontologies is being explored.
We also need to examine the inference problem for the semantic Web. Inference is the process of posing queries and
deducing new information. It becomes a problem when the deduced information is something the user is unauthorized to
know. With the semantic Web, and especially with data mining tools, one can make all kinds of inferences. Recently there
has been some research on controlling unauthorized inferences on the semantic Web.
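A small sketch may make the inference problem concrete. It assumes a single hypothetical deduction rule and a toy set of facts. The names RELEASABLE, SENSITIVE, deduce and safe_to_release are illustrative only and do not come from any published inference controller.

```python
# Facts that are each Unclassified when considered on their own (hypothetical).
RELEASABLE = {
    ("ship_A", "carries", "cargo_X"),
    ("cargo_X", "is_a", "weapon"),
}

# Facts the user is not authorized to know (hypothetical).
SENSITIVE = {("ship_A", "carries", "weapon")}


def deduce(facts):
    """Apply one assumed rule until nothing new appears:
    (s carries c) and (c is_a k)  =>  (s carries k)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (s, p1, c) in list(known):
            for (c2, p2, k) in list(known):
                if p1 == "carries" and p2 == "is_a" and c == c2:
                    new = (s, "carries", k)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known


def safe_to_release(already_released, candidate):
    """Refuse a fact if releasing it would let the user deduce a sensitive one."""
    return not (deduce(already_released | {candidate}) & SENSITIVE)


released = set()
for fact in sorted(RELEASABLE):
    if safe_to_release(released, fact):
        released.add(fact)

print(released)  # {('cargo_X', 'is_a', 'weapon')}
```

Each fact is harmless alone, but together they reveal the sensitive one, so the sketch withholds the second fact once the first has been released.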
Security should not be an afterthought. We have often heard that one needs to insert security into the system right from
the beginning. Similarly, security cannot be an afterthought for the semantic Web. However, we also cannot make the system
inefficient by insisting on one hundred percent security at all times. What is needed is a flexible security
policy. In some situations we may need one hundred percent security, while in others partial
security, e.g., 60 percent, may be sufficient.
Closely related to security is privacy. The challenge here is protecting sensitive information about individuals. Other
challenges include trust management and negotiation. How do we determine the trust that agents place in one another? Is it
based on the reputation of the agents?
Another challenge is maintaining integrity. For example, when XML documents are published by third parties, we need to
ensure that the documents are authentic and are of high quality. As more progress is made on investigating these various
issues, we hope that appropriate standards will be developed for securing the semantic Web. Note that while security is
essentially about confidentiality, we use the term trustworthiness to include not only confidentiality, but also privacy,
trust and integrity.
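As a minimal sketch of the authenticity check for a document obtained from a third-party publisher, one could compare a digest of the received document with a digest obtained from the owner. This assumes the owner's digest reaches the reader over a trusted channel. A real deployment would more likely rely on digital signatures, and the sketch says nothing about document quality. The function names are hypothetical.

```python
import hashlib
import hmac


def digest(document: bytes) -> str:
    """Return the SHA-256 digest of the document as a hex string."""
    return hashlib.sha256(document).hexdigest()


def is_authentic(document: bytes, trusted_digest: str) -> bool:
    """Check a received document against the owner's digest (constant-time compare)."""
    return hmac.compare_digest(digest(document), trusted_digest)


original = b"<order><item>widget</item><qty>10</qty></order>"
# Assumption: this value is obtained from the document owner, not the publisher.
trusted = digest(original)

# A copy altered by the publisher no longer matches the owner's digest.
tampered = original.replace(b"10", b"1000")

print(is_authentic(original, trusted))  # True
print(is_authentic(tampered, trusted))  # False
```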
We have written a series of books for CRC Press on data management, data mining and data security. The first book,
Data Management Systems: Evolution and Interoperation, focuses on general aspects of data management and also addresses
interoperability and migration. The second book, Data Mining: Technologies, Techniques, Tools and Trends, discusses data
mining. It essentially elaborates on sections of Data Management Systems: Evolution and Interoperation. The third book,
Web Data Management and E-Commerce, discusses Web database technologies and e-commerce as an application area. It, too,
elaborates on sections of Data Management Systems: Evolution and Interoperation. The fourth book in the series,
Managing and Mining Multimedia Databases, addresses both multimedia database management and multimedia data mining. It
elaborates on sections of both Data Management Systems: Evolution and Interoperation and Data Mining: Technologies,
Techniques, Tools and Trends. The fifth book, XML Databases and the Semantic Web, describes XML technologies related to
data management. It expands on parts of Web Data Management and E-Commerce. The sixth book, Web Data Mining Technologies
and Their Applications in Business Intelligence and Counter-terrorism, builds on Web Data Management and E-Commerce.
Database and Applications Security: Integrating Information Security and Data Management is the seventh book in the
series. It examines security for the technologies discussed in each of our previous books, focusing on technological
developments in database and applications security. It is essentially the integration of information security and
database technologies. One can regard Database and Applications Security as the start of a new series in data security.
The latest book, Building Trustworthy Semantic Webs, builds on Database and Applications Security. It also integrates
security with XML Databases and the Semantic Web.
An ontology is a data model that represents a set of concepts within a domain and the relationships between those
concepts. It is used to reason about the objects within that domain.
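To illustrate this definition, the sketch below models a tiny ontology as concepts linked by subclass relationships and reasons about an object in that domain. The vehicle concepts, the dictionaries and the function names are hypothetical, and the subclass links are assumed to contain no cycles. Ontology languages such as OWL are far more expressive than this.

```python
# Hypothetical concepts in a small vehicle domain and their subclass links.
SUBCLASS_OF = {
    "Sedan": "Car",
    "Car": "Vehicle",
    "Truck": "Vehicle",
}

# Hypothetical objects in the domain and the concept each one belongs to.
INSTANCE_OF = {"my_sedan": "Sedan"}


def ancestors(concept):
    """Return every concept reached by following subclass links upward."""
    found = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        found.append(concept)
    return found


def is_a(obj, concept):
    """Decide whether an object belongs to a concept, directly or by inheritance."""
    direct = INSTANCE_OF[obj]
    return concept == direct or concept in ancestors(direct)


print(is_a("my_sedan", "Vehicle"))  # True
print(is_a("my_sedan", "Truck"))    # False
```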
Dr. Bhavani Thuraisingham joined The University of Texas at Dallas (UTD) in October 2004 as a Professor of Computer
Science and Director of the Cyber Security Research Center in the Erik Jonsson School of Engineering and Computer
Science. She is an elected Fellow of three professional organizations: the IEEE (Institute of Electrical and Electronics
Engineers), the AAAS (American Association for the Advancement of Science) and the BCS (British Computer Society) for
her work in data security. She received the IEEE Computer Society’s prestigious 1997 Technical Achievement Award for
"outstanding and innovative contributions to secure data management." Prior to joining UTD, Dr. Thuraisingham was an
IPA (Intergovernmental Personnel Act) at the National Science Foundation (NSF) in Arlington, VA, from the MITRE
Corporation. At NSF, she established the Data and Applications Security Program, co-founded the Cyber Trust theme,
and was involved in inter-agency activities in data mining for counter-terrorism. She worked at MITRE in Bedford, MA,
between January 1989 and September 2001, first in the Information Security Center and later as a department head in
Data and Information Management as well as Chief Scientist in Data Management in the Intelligence and Air Force
centers. She has served as an expert consultant in information security and data management to the Department of
Defense, the Department of the Treasury and the Intelligence Community for over 10 years. Thuraisingham’s industry
experience includes six years of research and development at Control Data Corp. and Honeywell Inc.
Books by Bhavani Thuraisingham
1. Data Management Systems: Evolution and Interoperation
2. Data Mining: Technologies, Techniques, Tools and Trends
3. Web Data Management and E-Commerce
4. Managing and Mining Multimedia Databases
5. XML Databases and the Semantic Web
6. Web Data Mining Technologies and Their Applications in Business Intelligence and Counter-terrorism
7. Database and Applications Security: Integrating Information Security and Data Management
8. Building Trustworthy Semantic Webs
© Copyright 2007 Skaistis Consulting