Page 1: Dr. Bhavani Thuraisingham

Dr. Bhavani Thuraisingham

June 2010

Knowledge Management, Semantic Web and Social Networking

Introduction to the Semantic Web

Page 2: Dr. Bhavani Thuraisingham

Outline of Part 1

- Today’s web to tomorrow’s web
- Semantic web
- XML, RDF, Ontologies, OWL, Rules
- Ontology Engineering
- Vision

Page 3: Dr. Bhavani Thuraisingham

Semantic Web: Overview

- According to Tim Berners-Lee, the Semantic Web supports:
  - Machine-readable and machine-understandable web pages
  - Enterprise application integration
  - Nodes and links that essentially form a very large database

Premise: Semantic Web Technologies = XML, RDF, Ontologies, Rules

Applications: Web Database Management, Web Services, Information Integration

Page 4: Dr. Bhavani Thuraisingham

Today’s Web to Semantic web

- Today’s web
  - High recall, low precision: searches return too many web pages, many of them not relevant; sometimes recall is low as well
  - Results are sensitive to vocabulary: different words, even if they mean the same thing, do not result in the same web pages
  - Results are single web pages, not linked web pages
- Semantic web
  - Machine-understandable web pages
  - Activities on the web, such as searching, with little or no human intervention
  - Solutions to the problems faced by today’s web
  - Retrieving the appropriate web pages even when the vocabulary differs
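The "high recall, low precision" problem can be made concrete with a small calculation. The following is a minimal sketch in Python; the page identifiers and the counts are made up for illustration and do not come from the slides.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant pages retrieved / all pages retrieved
    recall    = relevant pages retrieved / all relevant pages
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical keyword search: 10 pages come back, only 2 are relevant,
# and those 2 are all the relevant pages that exist.
retrieved = [f"page{i}" for i in range(10)]
relevant = ["page0", "page1"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

In this toy case every relevant page is found (high recall), but most of what the user has to read is irrelevant (low precision).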

Page 5: Dr. Bhavani Thuraisingham

Knowledge Management and Personal Agents

- Knowledge Management
  - Corporate need: searching, extracting and maintaining information, uncovering hidden dependencies, viewing information
  - Semantic web for knowledge management
    - Organizing knowledge, automated tools for maintaining knowledge, question answering, querying multiple documents, controlling access to documents
- Agents
  - John is the president of a company and needs to have surgery. With the current web he has to check each web page for relevant information and make decisions depending on the information provided
  - With the semantic web, the agent will retrieve all the relevant information, synthesize it, ask John if needed, and then present the various options to John and also make recommendations

Page 6: Dr. Bhavani Thuraisingham

E-commerce

- Business to Consumer
  - Users shop on the web; wrapper technology is used to extract information about user preferences etc. and display the products
  - Use of semantic web: develop software agents that can interpret privacy requirements, pricing and product information and display timely and correct information to the user; the agents also provide information about the reputation of shops
  - Future: negotiation on behalf of the user
- Business to Business
  - Organizations work together and carry out transactions such as collaborating on a product, supply chains etc. Today’s web lacks standards for data exchange
  - Use of semantic web: XML is a big improvement, but organizations need to agree on vocabulary. The future will be the use of ontologies to agree on meanings and interpretations

Page 7: Dr. Bhavani Thuraisingham

Some aspects of semantic web

- Explicit Metadata
  - Metadata is data about data; metadata needs to be explicitly specified so that different groups and organizations will know what is on the web
  - Using metadata, one can then carry out various activities such as searching, integration and executing actions
  - Metadata specification languages include XML, RDF, OWL
- Semantic web vs Artificial Intelligence
  - The goal of Artificial Intelligence is to build an intelligent agent exhibiting human-level intelligence; the goal of the semantic web is to assist humans in their day-to-day online activities
- Logic and Reasoning
  - Logic can be used to specify facts as well as rules; new facts are derived from existing facts based on the inference rules; Description Logic is the type of logic that has been developed for semantic web applications

Page 8: Dr. Bhavani Thuraisingham

Layered Approach: Tim Berners-Lee’s Vision (www.w3c.org)

Page 9: Dr. Bhavani Thuraisingham

Semantic Web and Its Applications

[Figure: Tim Berners-Lee’s technology stack, bottom to top: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust. Applications shown with the stack: Web Services, Information Integration, Information Sharing]

Page 10: Dr. Bhavani Thuraisingham

Layered Architecture for Dependable Semantic Web at UTD

- Some challenges: security and privacy cut across all layers; integration of services; composability

[Figure: layered stack, bottom to top: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust; shown together with SECURITY, PRIVACY and Other Services]

- Adapted from Tim Berners-Lee’s description of the Semantic Web

Page 11: Dr. Bhavani Thuraisingham

What is XML all about?

- XML is needed due to the limitations of HTML and the complexities of SGML
- It is an extensible markup language specified by the W3C (World Wide Web Consortium)
- Designed to make the interchange of structured documents over the Internet easier
- Key to XML used to be Document Type Definitions (DTDs)
  - A DTD defines the role of each element of text in a formal model
- XML schemas have now become critical to specify the structure
  - XML schemas are also XML documents

Page 12: Dr. Bhavani Thuraisingham

Example XML Document

[Figure: tree diagram of an example XML document. Node labels: Asset report; Year: 2002; Name: U. of X; Assets; Dept; Name: CS; Patents; Patent; title; Author; ID; Funds; Grants; Contracts; Expenses; Equipment; Other assets; news]
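Since the tree layout of the slide’s figure is not recoverable, the following is only a sketch of how such a document might look and be read. The nesting, the attribute names and the sample patent are assumptions built from some of the figure’s node labels; parsing uses Python’s standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Assumed structure: only the labels come from the slide, not the nesting.
DOC = """
<AssetReport year="2002" name="U. of X">
  <Dept name="CS">
    <Patents>
      <Patent id="p1">
        <title>Example patent</title>
        <author>A. Author</author>
      </Patent>
    </Patents>
    <Funds>
      <Grants/>
      <Contracts/>
    </Funds>
  </Dept>
</AssetReport>
"""

root = ET.fromstring(DOC)

# Elements are addressed by name; the markup carries the structure,
# but nothing here says what a "Patent" or a "Dept" means.
dept_names = [dept.get("name") for dept in root.iter("Dept")]
patent_titles = [patent.findtext("title") for patent in root.iter("Patent")]
print(root.get("year"), dept_names, patent_titles)
```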

Page 13: Dr. Bhavani Thuraisingham

RDF

- Resource Description Framework is the essence of the semantic web
- XML cannot be used to specify semantics
- Example:
  - Professor is a subclass of Academic Staff
  - Professor inherits all properties of Academic Staff
- RDF was specified so that the inadequacies of XML could be handled; RDF uses XML syntax
- RDF concepts
  - Basic model
    - Resources, Properties and Statements
  - Container model
    - Bag, Sequence and Alternative
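The basic model and the Professor example can be sketched with plain Python tuples standing in for RDF statements. This is an illustration only, not RDF/XML syntax; the individual `john` and the short property names `type` and `subClassOf` are assumptions chosen for readability.

```python
# Each statement is a (resource, property, value) triple.
statements = {
    ("Professor", "subClassOf", "AcademicStaff"),
    ("john", "type", "Professor"),  # hypothetical individual
}


def superclasses(cls, triples):
    """All classes reachable from cls by following subClassOf links."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, triples):
    """Asserted types plus the types inherited through subClassOf."""
    direct = {o for s, p, o in triples if s == resource and p == "type"}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(cls, triples)
    return direct | inherited


print(types_of("john", statements))
```

Because the subclass statement is part of the data, a program can conclude that `john` is also a member of `AcademicStaff`, which is the kind of semantics plain XML markup does not carry.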

Page 14: Dr. Bhavani Thuraisingham

Ontology

- RDF has issues also
  - Cannot express several other properties such as union, intersection, relationships, etc.
- Need a richer language; ontology languages were developed by the semantic web community for this purpose
- What are ontologies?
  - Common definitions for any entity, person or thing
  - Several ontologies have been defined and are available for use
  - Defining a common ontology for an entity is a challenge
  - Mappings have to be developed for multiple ontologies
  - Specific languages have been developed for ontologies

Page 15: Dr. Bhavani Thuraisingham

OWL: Background

- It’s a language for ontologies and relies on RDF
- DARPA (Defense Advanced Research Projects Agency) developed the early language DAML (DARPA Agent Markup Language)
- Europeans developed OIL (Ontology Inference Layer)
- DAML+OIL combines both and was the starting point for OWL
- OWL was developed by W3C
- OWL features
  - Subclass relationship; class membership; equivalence of classes
  - Consistency checking (e.g., "x is an instance of A", "A is a subclass of B" and "x is not an instance of B" are inconsistent together)
  - Three types of OWL: OWL-Full, OWL-DL, OWL-Lite
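The consistency example from the slide can be checked mechanically. The following is a minimal sketch in plain Python, not an OWL reasoner; the function names and the pair-based encoding of the statements are assumptions made for illustration.

```python
def ancestors(cls, subclass_of):
    """All superclasses of cls, following (child, parent) pairs transitively."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for child, parent in subclass_of:
            if child == current and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def find_inconsistencies(instance_of, subclass_of, not_instance_of):
    """Return (individual, class) pairs that are both entailed and denied."""
    entailed = set()
    for individual, cls in instance_of:
        entailed.add((individual, cls))
        for parent in ancestors(cls, subclass_of):
            entailed.add((individual, parent))
    return sorted(entailed & set(not_instance_of))


# The slide's example: x is an instance of A, A is a subclass of B,
# yet x is stated not to be an instance of B.
clashes = find_inconsistencies(
    instance_of=[("x", "A")],
    subclass_of=[("A", "B")],
    not_instance_of=[("x", "B")],
)
print(clashes)
```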

Page 16: Dr. Bhavani Thuraisingham

Why Rules?

- RDF is built on XML and OWL is built on RDF
- We can express subclass relationships in RDF; additional relationships can be expressed in OWL
- However, reasoning power is still limited in OWL
- Hence the need for rules, and subsequently for a markup language for rules so that machines can understand them
- Examples: SWRL, RuleML
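As an illustration of what a rule adds, the sketch below applies one rule that joins two properties through a shared variable: hasParent(x, y) and hasBrother(y, z) imply hasUncle(x, z). It is written in plain Python rather than in SWRL or RuleML syntax, and the predicate names and individuals are assumptions chosen for the example.

```python
def apply_uncle_rule(facts):
    """hasParent(x, y) and hasBrother(y, z)  =>  hasUncle(x, z)."""
    derived = set()
    for pred1, x, y in facts:
        if pred1 != "hasParent":
            continue
        for pred2, y2, z in facts:
            # The shared variable y links the two conditions of the rule.
            if pred2 == "hasBrother" and y2 == y:
                derived.add(("hasUncle", x, z))
    return derived - facts  # keep only facts that are actually new


facts = {
    ("hasParent", "mary", "john"),   # hypothetical individuals
    ("hasBrother", "john", "james"),
}
new_facts = apply_uncle_rule(facts)
print(new_facts)
```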

Page 17: Dr. Bhavani Thuraisingham

What is Ontology Engineering?

- Tools and techniques to
  - Create ontologies, specify ontologies, maintain ontologies, query ontologies, evolve ontologies, reuse ontologies
- Much of the research focuses on using tools to develop ontologies from multiple heterogeneous data sources
- Essentially, this means extracting concepts from the data sources and expanding on them
- Uses a combination of data integration, metadata extraction, and machine learning techniques
- E.g., clustering of concepts, classification of concepts, etc.

Page 18: Dr. Bhavani Thuraisingham

Vision

- Semantic Web technologies represent and reason about the data on the web
  - Databases, Weblogs, Blogs, Chats, FOAF, Images, Video, etc.
- Social Networks are extracted from semantic web data using reasoning and data mining
- Social network analysis analyzes social networks using data mining and other reasoning techniques and extracts nuggets
- The nuggets are used for effective knowledge management

Page 19: Dr. Bhavani Thuraisingham

Outline of Part II

- This unit describes the relationship between Social Networks and the Semantic Web
- FOAF
- FLINK (Peter Mika, Free University)
- Extracting social networks from Semantic Web Data (Tim Finin et al, UMBC, Jennifer Golbeck UMC)
- Convergence and Vision
- Reference: P. Mika, Semantic Web and Social Networks, Springer, 2008

Page 20: Dr. Bhavani Thuraisingham

Semantic Social Networks

- The latest breed of social networking services combines social networks with the sharing of content such as bookmarks, documents, photos and reviews.
- The use of Semantic Web technology facilitates distributed control.
  - The friend-of-a-friend (FOAF) project is a first attempt at a formal, machine-processable representation of user profiles and friendship networks (unlike Friendster and similar sites, which have central control).
  - FOAF profiles are created and controlled by the individual user and shared in a distributed fashion.
  - http://www.foaf-project.org

Page 21: Dr. Bhavani Thuraisingham

FOAF

- The Friend of a Friend (FOAF) project is creating a Web of machine-readable pages describing people, the links between them and the things they create and do; it is a contribution to the linked information system known as the Web.
- FOAF defines an open, decentralized technology for connecting social Web sites, and the people they describe.
- FOAF is part of a shift towards a Web where we can choose the sites and tools we like, without being cut off from friends who made different choices.
- FOAF lets you share and inter-connect information from diverse sources, move it around, and use it in unexpected new ways.

Sharif University of Technology, Semantic Web Course, Fall 2005

Page 22: Dr. Bhavani Thuraisingham

FOAF Example

<foaf:Person rdf:about="#me"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:name>Dan Brickley</foaf:name>
  <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://danbri.org/" />
  <foaf:img rdf:resource="/images/me.jpg" />
</foaf:Person>
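A profile like this can be read with any namespace-aware XML parser. The sketch below uses Python’s standard `xml.etree.ElementTree` on a shortened copy of the profile, with the `rdf` namespace declared on the element so that the fragment parses on its own. The helper `mbox_sha1sum` and the example mailbox address are hypothetical; it relies on the FOAF convention that `foaf:mbox_sha1sum` is the SHA-1 digest of the mailbox URI, including the `mailto:` prefix.

```python
import hashlib
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the profile, made self-contained for parsing.
PROFILE = f"""
<foaf:Person rdf:about="#me" xmlns:foaf="{FOAF}" xmlns:rdf="{RDF}">
  <foaf:name>Dan Brickley</foaf:name>
  <foaf:homepage rdf:resource="http://danbri.org/" />
</foaf:Person>
"""

person = ET.fromstring(PROFILE)
# ElementTree addresses namespaced names as "{namespace-uri}localname".
name = person.findtext(f"{{{FOAF}}}name")
homepage = person.find(f"{{{FOAF}}}homepage").get(f"{{{RDF}}}resource")


def mbox_sha1sum(mailbox_uri):
    """SHA-1 hex digest of a mailbox URI such as 'mailto:someone@example.org'."""
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("mailto:someone@example.org")  # hypothetical address
print(name, homepage, digest)
```

Publishing the digest instead of the address lets two profiles be matched on the same mailbox without exposing the mailbox itself.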

Page 23: Dr. Bhavani Thuraisingham

Semantic Social Networks

Semantic Web researchers and their connections across the globe.

Page 24: Dr. Bhavani Thuraisingham

Semantic Social Networks

Social Network of a Semantic Web Researcher

Page 25: Dr. Bhavani Thuraisingham

FLINK (Peter Mika, Free University)

- Flink, the system developed at Free University (The Netherlands), is one of the early semantic social networks that exploits FOAF for the purposes of social intelligence.
  - Social intelligence is considered to be the semantics-based integration and analysis of social knowledge extracted from electronic sources under diverse ownership or control. In our case, these sources ...
- Flink extracts knowledge about the social networks of the community and consolidates what is learned using a common semantic representation, namely FOAF.

Page 26: Dr. Bhavani Thuraisingham

FLINK Architecture

Architecture of Flink

Page 27: Dr. Bhavani Thuraisingham

FLINK Architecture

- The architecture of Flink can be divided into three layers concerned with metadata acquisition, storage and visualization.
- The acquisition layer of the system concerns the acquisition of metadata (e.g., HTML pages from the web, FOAF profiles from the Semantic Web, public collections of emails and bibliographic data).
- The web mining component of Flink employs a co-occurrence analysis technique. The web mining component also performs the additional task of finding topic interests, i.e., associating researchers with certain areas of research.
- The middle layer is responsible for storing and enhancing metadata through reasoning.
- Inference is another major task of the middle layer. Sesame (we can also use JENA) applies the RDF closure rules to the data at upload time. This feature can be extended by defining domain-specific inference rules in Sesame’s custom rule language.
- The third layer is the browsing and visualization layer. The user interface of Flink is a pure Java web application based on the Model-View-Controller (MVC) paradigm.
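The slides do not say which co-occurrence measure the web mining component uses. As a sketch under that caveat, the code below assumes a Jaccard coefficient: the number of documents mentioning both names divided by the number mentioning either. The tiny document collection and the names are made up.

```python
from itertools import combinations


def cooccurrence_strength(documents, names):
    """Jaccard coefficient per pair of names over a document collection."""
    mentions = {
        name: {i for i, text in enumerate(documents) if name in text}
        for name in names
    }
    strengths = {}
    for a, b in combinations(sorted(names), 2):
        either = mentions[a] | mentions[b]
        both = mentions[a] & mentions[b]
        strengths[(a, b)] = len(both) / len(either) if either else 0.0
    return strengths


# Hypothetical web pages standing in for search results.
documents = [
    "Workshop paper by Alice and Bob",
    "Alice gives a keynote",
    "Bob and Carol edit a special issue",
    "Project page listing Alice and Bob",
]
strengths = cooccurrence_strength(documents, ["Alice", "Bob", "Carol"])
print(strengths)
```

Pairs whose strength exceeds some chosen threshold would become ties in the extracted social network; the threshold is a design choice the slides do not specify.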

Page 28: Dr. Bhavani Thuraisingham

Social Network Analysis on Semantic Web Data

- Social network analysis in Flink augments the web mining task by finding which people belong to which groups (called GROUP DETECTION)
- It also examines the associations and links between people: for example, what is the relationship between John and James? Are they just friends or do they have a romantic relationship? Do they often travel together?
- Semantic web reasoning tools (e.g., based on OWL, RDF and SWRL) may be used to reason and extract the nuggets.

Page 29: Dr. Bhavani Thuraisingham

Group Detection

- A large community often breaks up into a set of closely knit groups of individuals, woven together more loosely by the occasional interaction across groups.
- Based on this theory, SNA offers a number of clustering algorithms for identifying communities based on network data. Alternatively, the subgroups may be identified by the researcher using additional attribute data on the ...
- Peter Mika’s research uses interactive clustering software provided as a sample with the JUNG Java toolkit for SNA. This software allows the user to cluster a network using an edge-betweenness clusterer and visualize the results.
- As an example, a group of researchers from the AIFB Institute of the University of Karlsruhe quickly emerges as a single cluster of the network.
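The idea behind edge-betweenness clustering can be sketched without any toolkit: edges that carry many shortest paths are likely bridges between groups, so removing the highest-scoring edge repeatedly pulls the groups apart. The code below is a plain-Python illustration of that idea on a made-up graph; it is not the JUNG implementation, and the function names are assumptions.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Shortest-path betweenness per edge (unweighted, undirected graph)."""
    score = defaultdict(float)
    for source in adj:
        # Breadth-first search, counting shortest paths to every node.
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each edge used.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    return score  # every pair is counted from both ends; the ranking is unaffected


def components(adj):
    """Connected components as a list of node sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v] - group)
        seen |= group
        groups.append(group)
    return groups


def split_once(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        a, b = max(score, key=score.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Hypothetical network: two triangles joined by the single bridge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
clusters = split_once(edges)
print(clusters)
```

In the toy network the bridge between the two triangles is removed first, leaving each triangle as its own cluster.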

Page 30: Dr. Bhavani Thuraisingham

Linking Social Networks with FOAF

- One of the core goals of the Semantic Web is to store data in distributed locations, and use ontologies and reasoning to aggregate it.
- Social networking is a large movement on the web, and social networking data using the Friend of a Friend (FOAF) vocabulary makes up a significant portion of all data on the Semantic Web.
- Many traditional web-based social networks share their members’ information in FOAF format.
- While this is by far the largest source of FOAF online, there is no information about whether the social network models from each network overlap to create a larger unified social network model, or whether they are simply isolated components.
- Researchers at the U of MD have studied the intersection of FOAF data found in many online social networks. Using the semantics of the FOAF ontology and applying Semantic Web reasoning techniques, they show that a significant percentage of profiles can be merged from multiple networks.

Page 31: Dr. Bhavani Thuraisingham

Extracting Social Networks

- Extracting a social network from noisy, real-world data is a challenging task, even if the information is already encoded in RDF using well-defined ontologies.
- The process consists of three steps: discovering instances of foaf:Person, merging information about unique individuals, and linking persons through various social relation properties such as foaf:knows.

Page 32: Dr. Bhavani Thuraisingham

Extracting Social Networks (Tim Finin)

- A critical problem is determining whether two foaf:Person instances denote the same person. The semantics of the FOAF vocabulary suggests several heuristics to answer this question:
  - Named URI. Non-anonymous individuals using the same URI denote the same person.
  - Inverse functional properties. Inverse functional properties such as foaf:mbox and foaf:homepage identify unique individuals. Other properties, such as foaf:name and foaf:nick, while not strictly inverse functional, can be used in practice in conjunction with other properties like foaf:phone to identify individuals with high probability.
  - Semantic equality. When two or more values of an inverse functional property co-exist in the same individual’s description, they are semantically equivalent as identifying the same individual.
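The inverse functional property heuristic can be sketched as a merge over person descriptions: any two descriptions that share a value for such a property are taken to denote the same person, and the merging is transitive. The code below is a minimal illustration using a union-find structure; the dictionary layout, the function name `merge_persons` and the sample data are assumptions, not the method of the cited researchers.

```python
def merge_persons(descriptions, ifps=("mbox", "homepage")):
    """Group description indexes that share any inverse functional value."""
    parent = list(range(len(descriptions)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # (property, value) -> first description seen with that value
    for i, description in enumerate(descriptions):
        for prop in ifps:
            for value in description.get(prop, ()):
                key = (prop, value)
                if key in owner:
                    parent[find(i)] = find(owner[key])  # same person: merge
                else:
                    owner[key] = i

    groups = {}
    for i in range(len(descriptions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical foaf:Person descriptions collected from different sources.
descriptions = [
    {"name": ["Alice"], "mbox": ["mailto:alice@example.org"]},
    {"name": ["A. Smith"], "mbox": ["mailto:alice@example.org"],
     "homepage": ["http://example.org/~alice"]},
    {"name": ["Alice S."], "homepage": ["http://example.org/~alice"]},
    {"name": ["Bob"], "mbox": ["mailto:bob@example.org"]},
]
merged = merge_persons(descriptions)
print(merged)
```

The first and third descriptions share no value directly, but both match the second one, so all three end up in the same group. This mirrors the semantic equality heuristic, where co-existing values identify one individual.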

Page 33: Dr. Bhavani Thuraisingham

Convergence

- Semantic web data includes databases, files, web logs, blogs, emails, etc.
- Data mining applied to semantic web data, together with the reasoning capabilities of the semantic web, results in social networks
- Data mining applied to social networks extracts the nuggets
- Nuggets together with additional semantic web data such as ontologies result in knowledge
- Knowledge is utilized to improve the effectiveness of an organization

Page 34: Dr. Bhavani Thuraisingham

Convergence

[Figure: convergence diagram with the components Data Management/Data Mining/Data Analytics; Semantic Web Data/Reasoning (XML, RDF, OWL; e.g., databases, blogs, email); Social Networks/Analysis; Knowledge Management]

Page 35: Dr. Bhavani Thuraisingham

Vision

- Improved technologies for data representation
  - Data will include structured and unstructured databases, emails, blogs, files, relationships, video, images, audio, tags, links, ...
- Improved tools for reasoning
- Improved tools for data mining/data analytics
- Improved tools for social network extraction
- Improved tools for knowledge extraction
- Improved tools for knowledge management
- We call the above Information Analytics