University of Southampton

We are delighted to announce the first Web Science Summer School, to be held in Southampton, UK from July 20th to July 25th, 2014. Under the motto 'The age of data' the summer school will provide comprehensive education and networking opportunities to the next generation of Web scientists.

The school will feature a mixture of invited talks and tutorials on several aspects of data science, including data analytics, data publishing and interlinking, social media management, open data, and crowdsourcing, as well as group projects, in which the participants will have the chance to apply what they've learned by developing their own data science research ideas as part of a team.

We will also create opportunities for attendees to network with their peers, the school tutors, and the invited speakers and receive valuable advice from senior researchers during a student poster session.


  • Elena Simperl, University of Southampton
  • Claire Wyatt, University of Southampton
  • Wendy Hall, University of Southampton
  • Leslie Carr, University of Southampton


Sunday 20th July

19:00 Reception - The Cowherds in Southampton, The Common
Attendees can make their own way to The Cowherds, or wait to be picked up by Southampton Web Science PhD students at Glen Eyre Halls at 18:30.

Monday 21st July

08:15 Registration - 67/1003
09:00 Opening - 67/1003
09:15 Keynote: Wendy Hall - 67/1027
10:15 Coffee Break - 67/1003
10:45 Tutorial: Introduction to data science and analytics (Claudia Wagner) - 67/1003
12:30 Lunch break (Packed Lunch) - Outside or in B39/40
14:00 Tutorial: Data cleansing and visualisation (Max van Kleek) - 67/1003
15:00 Coffee Break - 67/1003
15:30 Hands-on: Basic tools and programming kit - 67/1003
18:00 Dinner - 32/3077
19:00 Poster Session (Until 8pm) - 32/3077

Tuesday 22nd July

09:00 Administrative issues - 67/E1001
09:15 Keynote: Jim Hendler - 45/2040 L/R A
10:15 Coffee break - 67/E1001
10:45 Tutorial: Data publishing and interlinking (Barry Norton) - 67/E1001
12:30 Lunch break (Packed Lunch) - Outside or in B39/40
13:30 Keynote: Nigel Shadbolt - 45/2040
14:30 Tutorial: Web Observatory (Ramine Tinati) - 67/E1001
15:30 Coffee break - 67/E1001
16:00 Hands-on: Data publishing and interlinking - 67/E1001
18:00 Dinner - 32/3077
19:00 Poster session (Until 8pm) - 32/3077

Wednesday 23rd July

09:00 Administrative issues - 67/E1001
09:15 Tutorial: Crowdsourcing (Lora Aroyo) - 67/E1001
10:15 Coffee break - 67/E1001
10:45 Tutorial: Crowdsourcing (Lora Aroyo) - 67/E1001
11:30 Hands-on: MTurk - 67/E1001
12:30 Lunch break (Packed Lunch) - Outside or in B39/40
13:30 Keynote: Cory Doctorow - 67/E1007
15:00 Excursion - Winchester - Meet at the Interchange
20:00 Dinner (Part of excursion)

Thursday 24th July

09:00 Administrative issues - 67/1027
09:15 Keynote: Chris Welty - 67/1027
10:15 Coffee break - 67/1027
10:45 Student project work - Separate Workrooms (see below)
12:30 Lunch break (Packed Lunch)
14:00 Student project work - Separate Workrooms (see below)
20:00 Dinner - The Cowherds in Southampton, The Common

Work rooms

1. 67/1015 T Rm 1
2. 32/4049 T Rm
3. 32/4005 T Rm
4. 32/3005 T Rm
5. 32/3049 T Rm

Friday 25th July

09:00 Administrative issues - 67/1027
09:15 Keynote: Guus Schreiber - 67/1027
10:15 Coffee break - 67/1027
10:45 Student project work - Separate Workrooms (see below)
12:30 Lunch break (Packed Lunch) - Outside or in B39/40
14:00 Student project work - Separate Workrooms (see below)
20:00 Dinner - Pitcher & Piano, Ocean Village

Work rooms

1. 67/1013 T Rm 2
2. 32/4049 T Rm
3. 32/4005 T Rm

Saturday 26th July

09:00 Project presentations - 32/3077
12:00 Break & Judges' deliberation - 32/3077
12:30 Awards & Closing - 32/3077

Map and programme

A full programme, including a complete schedule of the week, has been included in your welcome pack. You can additionally download a PDF of this programme by clicking on this link. A map of the University of Southampton Highfield Campus can be found using the following link - Map


Monday 21st July: Wendy Hall

Wendy Hall

Professor Dame Wendy Hall is Dean of the Faculty of Physical Sciences and Engineering, and is Professor of Computer Science at the University of Southampton. She is perhaps best known for her pioneering work in the area of Web Science. Through several significant roles of leadership and management, she has been instrumental in shaping the agenda for Engineering policy and education, and her work has earned recognition as Dame Commander of the British Empire in the 2009 UK New Year's Honours list, and as a Fellow of the Royal Society in June 2009.

Wendy's research interests cover a broad set of issues within the areas of multimedia and hypermedia. She is particularly involved in the novel challenges posed by Digital Libraries and the Semantic Web. Her involvement with a platform grant (SOCIAM) reflects the aim of creating social systems on the web - social machines - that are efficient, interrogatable, and capable of solving complex problems at a large scale.


Presentation: Observing the Web

Tuesday 22nd July: Nigel Shadbolt

Nigel Shadbolt

The Fifth Paradigm: Open Innovation and Open Data

We are living in an age of superabundant information. The Internet and World Wide Web have been the agents of this revolution. This deluge of information and data has led to a range of scientific discoveries and engineering innovations. Data at Web scale has allowed us to characterise the shape and structure of the Web itself and to efficiently search its billions of items of content. Data published on the Web has enabled the mobilisation of hundreds of thousands of humans to solve problems beyond any individual or single organisation. Open government data published on the Web is improving the efficiency and accountability of our public services. Open data is giving rise to open innovation that is generating social, environmental and economic value. Data science is emerging as an area of competitive advantage for individuals, companies, public and private sector organisations and nation states. A Web of data offers new opportunities and challenges for science, government and business. This lecture will discuss these fast-moving developments and how they will impact our lives.


Professor Sir Nigel Shadbolt is an academic and commentator who studies and writes about open data, artificial intelligence, computer and web science. During his 33-year career, he has also worked in philosophy, psychology and linguistics. In 2013 he was knighted in the Queen’s Birthday Honours “for services to science and engineering”.

Today, Nigel draws together this multidisciplinary expertise to focus on understanding how the web is evolving and changing society. He is passionate about how humans and computers can solve problems together at web scale.

Presentation: Fifth Paradigm: Open Innovation and Open Data

Tuesday 22nd July: Jim Hendler

Jim Hendler

Keynote - Broad Data: Challenges on the emerging Web of data

"Big Data" usually refers to the very large datasets generated by scientists, to the many petabytes of data held by companies like Facebook and Google, and to analyzing real-time data assets like the stream of Twitter messages emerging from events around the world. Key areas of interest include technologies to manage much larger datasets, technologies for the visualization and analysis of databases, cloud-based data management and datamining algorithms.

Recently, however, we have begun to see the emergence of another, and equally compelling data challenge -- that of the "Broad data" that emerges from millions and millions of raw datasets available on the World Wide Web. For broad data the new challenges that emerge include Web-scale data search and discovery, rapid and potentially ad hoc integration of datasets, visualization and analysis of only-partially modeled datasets, and issues relating to the policies for data use, reuse and combination. In this talk, we present the broad data challenge and discuss potential starting points for solutions including those arising from research in the Semantic Web area. We illustrate these approaches using data from a "meta-catalog" of over 1,000,000 open datasets that have been collected from about two hundred governments around the world.


Professor James Hendler is the Director of the Institute for Data Exploration and Applications and the Tetherless World Professor of Computer, Web and Cognitive Sciences at Rensselaer Polytechnic Institute. He also serves as a Director of the UK's charitable Web Science Trust. He has authored over 200 technical papers in the areas of Semantic Web, artificial intelligence, agent-based computing and high performance processing. One of the originators of the "Semantic Web", Hendler was the recipient of a 1995 Fulbright Foundation Fellowship, is a Fellow of the American Association for Artificial Intelligence, the British Computer Society, the IEEE and the AAAS. In 2010, Hendler was named one of the 20 most innovative professors in America by Playboy magazine and was selected as an "Internet Web Expert" by the US government. In 2012, he was one of the inaugural recipients of the Strata Conference "Big Data" awards for his work on large-scale open government data, and he is a columnist and associate editor of the Big Data journal. In 2013, he was appointed Open Data Advisor to New York State by Governor Cuomo.

Presentation: Semantic Web: The Inside Story

Wednesday 23rd July: Cory Doctorow

Cory Doctorow

Keynote - Huxleying Our Way into the Full Orwell: how streaming media and related delusions threaten privacy, free speech, and democracy


Cory Efram Doctorow is a Canadian-British blogger, journalist, and science fiction author who serves as co-editor of the blog Boing Boing. He is an activist in favour of liberalising copyright laws and a proponent of the Creative Commons organization, using some of their licenses for his books. Some common themes of his work include digital rights management, file sharing, and post-scarcity economics. — Wikipedia

Presentation: How streaming media and related delusions threaten privacy

Thursday 24th July: Chris Welty

Chris Welty

Keynote - Semantic Technology in Watson

Watson is a computer system capable of answering rich natural language questions and estimating its confidence in those answers at a level rivalling the best humans at the task. On Feb 14-16, 2011, in an historic event, Watson triumphed, in a widely televised broadcast of the American quiz show Jeopardy!, over the best human players of all time. In this talk I will discuss how Watson works at a high level with examples from the show, and concentrate on the use of semantic technology in Watson.


Chris Welty is a Research Scientist at the IBM T.J. Watson Research Center in New York. Previously, he taught Computer Science at Vassar College, taught at and received his Ph.D. from Rensselaer Polytechnic Institute, and accumulated over 14 years of teaching experience before moving to industrial research. Chris' principal area of research is Knowledge Representation, specifically ontologies and the semantic web, and he spends most of his time applying this technology to Natural Language Question Answering as a member of the DeepQA/Watson team and, in the past, Software Engineering. Dr. Welty is a co-chair of the W3C Rules Interchange Format Working Group (RIF), serves on the steering committee of the Formal Ontology in Information Systems Conferences, is president of KR.ORG, serves on the editorial boards of AI Magazine, The Journal of Applied Ontology, and The Journal of Web Semantics, and was an editor in the W3C Web Ontology Working Group. While on sabbatical in 2000, he co-developed the OntoClean methodology with Nicola Guarino. Chris Welty's work on ontologies and ontology methodology has appeared in CACM and numerous other publications.

Presentation: Inside the mind of Watson

Friday 25th July: Guus Schreiber

Guus Schreiber

Keynote - Knowledge Engineering and the Web

The Web can be viewed as a vehicle for knowledge democracy. Several technologies have been developed to support knowledge transfer via the Web, including languages like RDF, OWL and SKOS. We discuss the effectiveness of these technologies, as well as methods and techniques that can be used for practical Web-based knowledge engineering. We illustrate this with an application example in the domain of digital heritage collections.


Guus Schreiber is a professor of Intelligent Information Systems at the Department of Computer Science of the VU University Amsterdam. His research interests are mainly in knowledge and ontology engineering, with a special interest in applications in the field of cultural heritage. He was one of the key developers of the CommonKADS methodology. He acts as chair of W3C groups for Semantic Web standards such as RDF, OWL, SKOS and RDFa. His research group is involved in a wide range of national and international research projects. He is now project coordinator of the EU Integrated Project NoTube, concerned with integration of Web and TV data with the help of semantics, and was previously Scientific Director of the EU Network of Excellence "Knowledge Web".

Schreiber studied medicine at the University of Utrecht. After working for two years at the University of Leiden in the Medical Informatics department, in 1986 he joined the SWI (Social Science Informatics) group of Bob Wielinga at the University of Amsterdam, where he was involved in research on knowledge engineering. In 1992 he was awarded a Ph.D. for a thesis entitled "Pragmatics of the Knowledge Level". In 2003 he moved to the VU.

Presentation: Knowledge engineering and the web


Lora Aroyo

Lora Aroyo is an associate professor at the Web & Media Group, Department of Computer Science, VU University Amsterdam, The Netherlands. Her research work is focused on semantic technologies for modeling user and context for recommendation systems and personalized access of online multimedia collections, e.g. cultural heritage collections, multimedia archives and interactive TV. She was a scientific coordinator of the NoTube project, dealing with the integration of Web and TV data with the help of semantics, and a number of nationally funded projects, such as CHIP and Agora, dealing with modelling events and event narratives. She has been co-chair of numerous workshops on crowdsourcing, social web and cultural heritage. Lora is actively involved in the Semantic Web community as a program chair for the European and the International Semantic Web Conferences in 2009 and 2011, as conference chair for the ESWC 2010 Conference, and as a member of the editorial board of the Semantic Web Journal. She is also actively involved in the Personalization and User modeling community as vice-president of User Modeling Inc., and as a member of the editorial boards of the Journal of Human-Computer Studies and the User Modeling and User-Adapted Interaction Journal. In 2012 and 2013 she won IBM Faculty Awards for her work on Crowd Truth: crowdsourcing for ground truth data collection for adapting the IBM Watson system to the medical domain.


Presentation: Crowdsourcing

Downloadable Resources

Barry Norton

Dr. Barry Norton is a Development Manager at the British Museum, responsible for the architecture and project management of ResearchSpace, a Linked Data platform for the arts and cultural heritage domain. Previously, as Solutions Architect for Ontotext, he provided consultancy on this and many other large-scale Linked Data and text analytics solutions, and training on semantic technologies for the cultural heritage, media and finance domains. He holds a PhD from the University of Sheffield. He worked on semantics-based national and European-funded research projects there and at the Open University, the University of Innsbruck, Karlsruhe Institute of Technology and Queen Mary University of London. He is the author of over fifty peer-reviewed publications in the area, including three book chapters.

Presentation: Data Publishing and Interlinking

Downloadable Resources

Max van Kleek

Max arrived at ECS Southampton in February 2010 after studying at MIT CSAIL for 10 years as a Ph.D. student in the Haystack Group under Prof. David Karger. Prior to CSAIL he worked at the MIT Media Lab, doing interaction design under Prof. John Maeda and the Aesthetics and Computation Group, and focused on AI algorithms for personal information visualisation and user modelling.

During his studies there he created List-it: an open source note taking tool, Atomate: a personal reactive assistant for the web, eyebrowse: a way to share your web browsing activities in real time, and Poyozo: an automatic personal diary.

Presentation: Data Cleansing and Visualisation

Downloadable Resources

Ramine Tinati

Ramine is a research fellow for the SOCIAM research project at the University of Southampton. In 2013 he gained his PhD in Web Science investigating novel ways to combine quantitative and qualitative approaches to better understand the development of the Web. Since working as a research fellow, Ramine has been developing analytics and workflows to support real-time and historic analysis of Web systems.

This research, which feeds into the Web Observatory project and theme within SOCIAM, includes investigating current big data and stream processing techniques, as well as appropriate solutions for distributed storage and analysis. Amongst the various analytical approaches developed to simplify and better understand extremely large datasets, serious effort is being put into developing methodologies that bridge the quantitative and qualitative divide.

Presentation: Web Observatory

Downloadable Resources

Claudia Wagner

Claudia is a postdoctoral researcher at the Computational Social Science department at GESIS - Leibniz Institute for the Social Sciences and an adjunct lecturer at the University of Koblenz-Landau. Her research interests include computational social science, data science, data mining and natural language processing.

Claudia received her PhD from Graz University of Technology for her research on emergent structures, usage and semantics of social streams. During her PhD she interned at HP Labs (US), Xerox PARC (US) and the Open University (UK).

Presentation: Introduction to Data Science and Analytics

Downloadable Resources


Key Dates

Closing date for application: June 20th, 2014
Notification of acceptance: June 23rd, 2014
Deadline for registration/payment: July 4th, 2014


The registration fee is £550 (€670)

The registration fee covers meals (from Monday morning, July 21st to Saturday lunch time, July 26th, 2014) and board (6 nights from Sunday, July 20th to Saturday, July 26th, 2014), as well as meeting facilities, learning materials, poster sessions, and an excursion.


In particular, the Web Science Summer School targets Masters and PhD students, as well as other junior researchers from various disciplines who are interested in learning more about the scientific methods and tools that are needed to process and make sense of the increasingly large and diverse amounts of data that are generated and used in virtually any type of organization today.

More Information

If you would like more information, or would like to modify or withdraw your current application, please contact Claire Wyatt by telephone on +44 (0)23 8059 2738.


Payment should be made through the University of Southampton online store.

Joining Instructions

We have put together a handy booklet for summer school attendees which contains detailed maps, accommodation details and transport information for quick reference (click to download).

Transport to Southampton

The joining booklet contains instructions on how to get to Southampton once you are in the UK. In addition, you can take the National Express coach from your airport directly to the University. The bus stop for the University is known as ‘The Interchange’ on Highfield Campus. Once you arrive on the Highfield campus at The Interchange stop, see the directions below on how to get to the accommodation.

You can also take a train from London to get to Southampton. Southampton has two train stations: Southampton Central and Southampton Airport Parkway. The University has a bus service called ‘Uni-link’ that can bring you to The Interchange on Highfield Campus from either of those stations. A one-way fare is £2. Instructions for travelling by car are in the conference joining instructions.


Your breakfast is included in your accommodation fee and you will be given vouchers to use in the ‘Piazza’ (a restaurant) which is on the Highfield Campus. You will be given a map in your welcome pack (on Sunday night) that will show you how to get to the Piazza from Glen Eyre Halls on foot. We suggest that when you leave the halls for breakfast, you bring everything you need for the day because there won’t be time to go back to your accommodation before the tutorials start.

No voucher is needed for breakfast on Saturday, because ‘The Terrace’ Restaurant in Building 38 will be open to all attendees.

Directions to Glen Eyre Halls

Once you arrive on Highfield campus, you will need to walk to the Glen Eyre halls. On the map, the Uni-link bus stop (known as 'The Interchange') is located next to building 6. This is where you will need to get off if you have travelled by bus or National Express coach. If you take a taxi from the stations or airport, it is probably best to ask the taxi driver to drop you directly at the Glen Eyre Halls reception.

When you arrive at the Halls of Residence, you will need to go to the 24-hour Reception to collect your room key, access code and Welcome Pack. Check-in time is usually from 15:00 (unless special arrangements have been made) and check-out time is 09:30.

Sunday 20th July – Visit to The Cowherds Pub

Claire Wyatt (plus several students) will meet you at the Glen Eyre reception at 18:30 to walk you to The Cowherds pub. At the pub, you will receive a summer school welcome pack that will contain your name badge, door access card, maps and breakfast vouchers.

Accommodation & Excursion

Glen Eyre Halls

Accommodation for attendees will be provided within the Glen Eyre complex, which is just a 10-minute walk from Highfield Campus (see map). Attendees will be sleeping in single-occupancy en-suite rooms.

Breakfast Arrangements

No food will be available within the Glen Eyre complex (which will be closed for summer) and all attendees will need to walk to the Piazza eatery on Highfield Campus for breakfast. Attendees will be provided with breakfast vouchers which they can redeem at the Piazza eatery.

More information about the Piazza eatery (including menus and a map) can be found with the following link: Piazza


The excursion to England's historic capital city, Winchester, will take place on Wednesday. It will include a guided tour of Winchester Cathedral (the longest Gothic cathedral in Europe!) and dinner.