-
Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd Ideation
Authors:
Samuel Rhys Cox,
Yunlong Wang,
Ashraf Abdul,
Christian von der Weth,
Brian Y. Lim
Abstract:
Crowdsourcing can collect many diverse ideas by prompting ideators individually, but this can generate redundant ideas. Prior methods reduce redundancy by presenting peers' ideas or peer-proposed prompts, but these require much human coordination. We introduce Directed Diversity, an automatic prompt selection approach that leverages language model embedding distances to maximize diversity. Ideators can be directed towards diverse prompts and away from prior ideas, thus improving their collective creativity. Since there are diverse metrics of diversity, we present a Diversity Prompting Evaluation Framework consolidating metrics from several research disciplines to analyze along the ideation chain - prompt selection, prompt creativity, prompt-ideation mediation, and ideation creativity. Using this framework, we evaluated Directed Diversity in a simulation study and four user studies for the use case of crowdsourcing motivational messages to encourage physical activity. We show that automated diverse prompting can variously improve collective creativity across many nuanced metrics of diversity.
Submitted 15 January, 2021;
originally announced January 2021.
-
Helping Users Tackle Algorithmic Threats on Social Media: A Multimedia Research Agenda
Authors:
Christian von der Weth,
Ashraf Abdul,
Shaojing Fan,
Mohan Kankanhalli
Abstract:
Participation on social media platforms has many benefits but also poses substantial threats. Users often face an unintended loss of privacy, are bombarded with mis-/disinformation, or are trapped in filter bubbles due to over-personalized content. These threats are further exacerbated by the rise of hidden AI-driven algorithms working behind the scenes to shape users' thoughts, attitudes, and behavior. We investigate how multimedia researchers can help tackle these problems to level the playing field for social media users. We perform a comprehensive survey of algorithmic threats on social media and use it as a lens to set a challenging but important research agenda for effective and real-time user nudging. We further implement a conceptual prototype and evaluate it with experts to supplement our research agenda. This paper calls for solutions that combat the algorithmic threats on social media by utilizing machine learning and multimedia content analysis techniques but in a transparent manner and for the benefit of the users.
Submitted 26 August, 2020;
originally announced September 2020.
-
SPARQL for Networks of Embedded Systems
Authors:
Dennis Boldt,
Henning Hasemann,
Alexander Kröller,
Marcel Karnstedt,
Christian von der Weth
Abstract:
The Semantic Web (or Web of Data) represents the successful efforts towards linking and sharing data over the Web. The cornerstones of the Web of Data are RDF as data format and SPARQL as de-facto standard query language. Recent trends show the evolution of the Web of Data towards the Web of Things, integrating embedded devices and smart objects. Data stemming from such devices do not share a common format, making integration and querying impossible. To overcome this problem, we present our approach to make embedded systems first-class citizens of the Web of Things. Our framework abstracts from individual deployments to represent them as common data sources in line with the ideas behind the Semantic Web. This includes the execution of arbitrary SPARQL queries over the data from a pool of embedded devices and/or external data sources. Handling verbose RDF data and executing SPARQL queries in an embedded network pose major challenges in minimizing the processing and communication cost involved. We therefore present an in-network query processor aiming to push processing steps onto devices. We demonstrate the practical application and the potential benefits of our framework in a comprehensive evaluation using a real-world deployment and a range of SPARQL queries stemming from a common use case of the Web of Things.
Submitted 28 February, 2014;
originally announced February 2014.
-
Analysing Parallel and Passive Web Browsing Behavior and its Effects on Website Metrics
Authors:
Christian von der Weth,
Manfred Hauswirth
Abstract:
Getting deeper insights into the online browsing behavior of Web users has been a major research topic since the advent of the WWW. It provides useful information to optimize website design, Web browser design, search engine offerings, and online advertisement. We argue that new technologies and new services continue to have significant effects on how people browse the Web. For example, listening to music clips on YouTube or to a radio station on Last.fm does not require users to sit in front of their computer. Social media and networking sites like Facebook or micro-blogging sites like Twitter have attracted new types of users that previously were less inclined to go online. These changes in how people browse the Web feature new characteristics which are not well understood so far. In this paper, we provide novel and unique insights by presenting first results of DOBBS, our long-term effort to create a comprehensive and representative dataset capturing online user behavior. We first investigate the concepts of parallel browsing and passive browsing, showing that browsing the Web is no longer a dedicated task for many users. Based on these results, we then analyze their impact on the calculation of a user's dwell time -- i.e., the time the user spends on a webpage -- which has become an important metric to quantify the popularity of websites.
Submitted 21 February, 2014;
originally announced February 2014.
-
Virtual Location-Based Services: Merging the Physical and Virtual World
Authors:
Christian von der Weth,
Vinod Hegde,
Manfred Hauswirth
Abstract:
Location-based services gained much popularity through providing users with helpful information with respect to their current location. The search and recommendation of nearby locations or places, and the navigation to a specific location are some of the most prominent location-based services. As a recent trend, virtual location-based services consider webpages or sites associated with a location as 'virtual locations' that online users can visit despite not being physically present at the location. The presence of links between virtual locations and the corresponding physical locations (e.g., geo-location information of a restaurant linked to its website) allows for novel types of services and applications which constitute virtual location-based services (VLBS). The quality and potential benefits of such services largely depend on the existence of websites referring to physical locations. In this paper, we investigate the usefulness of linking virtual and physical locations. For this, we analyze the presence and distribution of virtual locations, i.e., websites referring to places, for two Irish cities. Using simulated tracks based on a user movement model, we investigate how mobile users move through the Web as virtual space. Our results show that virtual locations are omnipresent in urban areas, and that a user being close to several such locations at any time is the norm rather than the exception.
Submitted 10 October, 2013;
originally announced October 2013.
-
Finding Information Through Integrated Ad-Hoc Socializing in the Virtual and Physical World
Authors:
Christian von der Weth,
Manfred Hauswirth
Abstract:
Despite the services of sophisticated search engines like Google, there are a number of interesting information sources which are useful but largely inaccessible to current Web users. These information sources are often ad-hoc, location-specific and only useful for users over short periods of time, or relate to tacit knowledge of users or implicit knowledge in crowds. The solution presented in this paper addresses these problems by introducing an integrated concept of "location" and "presence" across the physical and virtual worlds, enabling ad-hoc socializing of users interested in, or looking for, similar information. While the definition of presence in the physical world is straightforward - through a spatial location and vicinity at a certain point in time - their definitions in the virtual world are neither obvious nor trivial. Based on a detailed analysis we provide an integrated spatial model spanning both worlds which enables us to define presence of users in a unified way. This integrated model allows us to enable ad-hoc socializing of users browsing the Web with users in the physical world specific to their joint information needs and allows us to unlock the untapped information sources mentioned above. We describe a proof-of-concept implementation of our model and provide an empirical analysis based on real-world experiments.
Submitted 5 July, 2013;
originally announced July 2013.
-
DOBBS: Towards a Comprehensive Dataset to Study the Browsing Behavior of Online Users
Authors:
Christian von der Weth,
Manfred Hauswirth
Abstract:
The investigation of the browsing behavior of users provides useful information to optimize web site design, web browser design, search engine offerings, and online advertisement. This has been a topic of active research since the Web started and a large body of work exists. However, new online services as well as advances in Web and mobile technologies clearly changed the meaning behind "browsing the Web" and require a fresh look at the problem and research, specifically with respect to whether the used models are still appropriate. Platforms such as YouTube, Netflix or last.fm have started to replace the traditional media channels (cinema, television, radio) and media distribution formats (CD, DVD, Blu-ray). Social networks (e.g., Facebook) and platforms for browser games attracted whole new, particularly less tech-savvy audiences. Furthermore, advances in mobile technologies and devices made browsing "on-the-move" the norm and changed user behavior, as mobile browsing is often influenced by the user's location and context in the physical world. Commonly used datasets, such as web server access logs or search engine transaction logs, are inherently not capable of capturing the browsing behavior of users in all these facets. DOBBS (DERI Online Behavior Study) is an effort to create such a dataset in a non-intrusive, completely anonymous and privacy-preserving way. To this end, DOBBS provides a browser add-on that users can install, which keeps track of their browsing behavior (e.g., how much time they spend on the Web, how long they stay on a website, how often they visit a website, how they use their browser, etc.). In this paper, we outline the motivation behind DOBBS, describe the add-on and captured data in detail, and present some first results to highlight the strengths of DOBBS.
Submitted 5 July, 2013;
originally announced July 2013.
-
GutenTag: A Multi-Term Caching Optimized Tag Query Processor for Key-Value Based NoSQL Storage Systems
Authors:
Christian von der Weth,
Anwitaman Datta
Abstract:
NoSQL systems are increasingly deployed as back-end infrastructure for large-scale distributed online platforms like Google, Amazon or Facebook. Their applicability results from the fact that most services of online platforms access the stored data objects via their primary key. However, NoSQL systems do not efficiently support services referring to more than one data object, e.g., the term-based search for data objects. To address this issue we propose our architecture based on an inverted index on top of a NoSQL system. For queries comprising more than one term, distributed indices yield limited performance in large distributed systems. We propose two extensions to cope with this challenge. Firstly, we store index entries not only for single terms but also for a selected set of term combinations depending on their popularity derived from a query history. Secondly, we additionally cache popular keys on gateway nodes, which are a common concept in real-world systems, acting as interface for services when accessing data objects in the back end. Our results show that we can significantly reduce the bandwidth consumption for processing queries, with an acceptable, marginal increase in the load of the gateway nodes.
Submitted 23 May, 2011;
originally announced May 2011.