Showing 1–50 of 71 results for author: Koch, C

Searching in archive cs.
  1. arXiv:2505.04604  [pdf, ps, other]

    cs.LG cs.CC cs.DS stat.ML

    Testing Juntas Optimally with Samples

    Authors: Lorenzo Beretta, Nathaniel Harms, Caleb Koch

    Abstract: We prove tight upper and lower bounds of $\Theta\left(\tfrac{1}{\varepsilon}\left( \sqrt{2^k \log\binom{n}{k}} + \log\binom{n}{k} \right)\right)$ on the number of samples required for distribution-free $k$-junta testing. This is the first tight bound for testing a natural class of Boolean functions in the distribution-free sample-based model. Our bounds also hold for the feature selection problem, showing that a…

    Submitted 7 May, 2025; originally announced May 2025.

  2. arXiv:2504.16875  [pdf, other]

    cs.LG

    Hybrid Reinforcement Learning and Model Predictive Control for Adaptive Control of Hydrogen-Diesel Dual-Fuel Combustion

    Authors: Julian Bedei, Murray McBain, Alexander Winkler, Charles Robert Koch, Jakob Andert, David Gordon

    Abstract: Reinforcement Learning (RL) and Machine Learning Integrated Model Predictive Control (ML-MPC) are promising approaches for optimizing hydrogen-diesel dual-fuel engine control, as they can effectively control multiple-input multiple-output systems and nonlinear processes. ML-MPC is advantageous for providing safe and optimal controls, ensuring the engine operates within predefined safety limits. In…

    Submitted 6 May, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

  3. arXiv:2504.11259  [pdf, ps, other]

    cs.DB

    The Cambridge Report on Database Research

    Authors: Anastasia Ailamaki, Samuel Madden, Daniel Abadi, Gustavo Alonso, Sihem Amer-Yahia, Magdalena Balazinska, Philip A. Bernstein, Peter Boncz, Michael Cafarella, Surajit Chaudhuri, Susan Davidson, David DeWitt, Yanlei Diao, Xin Luna Dong, Michael Franklin, Juliana Freire, Johannes Gehrke, Alon Halevy, Joseph M. Hellerstein, Mark D. Hill, Stratos Idreos, Yannis Ioannidis, Christoph Koch, Donald Kossmann, Tim Kraska, et al. (21 additional authors not shown)

    Abstract: On October 19 and 20, 2023, the authors of this report convened in Cambridge, MA, to discuss the state of the database research field, its recent accomplishments and ongoing challenges, and future directions for research and community engagement. This gathering continues a long-standing tradition in the database community, dating back to the late 1980s, in which researchers meet roughly every five…

    Submitted 15 April, 2025; originally announced April 2025.

  4. Using Process Calculus for Optimizing Data and Computation Sharing in Complex Stateful Parallel Computations

    Authors: Zilu Tian, Dan Olteanu, Christoph Koch

    Abstract: We propose novel techniques that exploit data and computation sharing to improve the performance of complex stateful parallel computations, like agent-based simulations. Parallel computations are translated into behavioral equations, a novel formalism layered on top of the foundational process calculus $\pi$-calculus. Behavioral equations blend code and data, allowing a system to easily compose and…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: To appear in SIGMOD'25

  5. arXiv:2412.16720  [pdf, other]

    cs.AI

    OpenAI o1 System Card

    Authors: OpenAI, Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksander Madry, Alex Beutel, Alex Carney, Alex Iftimie, Alex Karpenko, Alex Tachard Passos, Alexander Neitz, Alexander Prokofiev, Alexander Wei, Allison Tam, Ally Bennett, Ananya Kumar, Andre Saraiva, Andrea Vallone, Andrew Duberstein, Andrew Kondrich, et al. (238 additional authors not shown)

    Abstract: The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-ar…

    Submitted 21 December, 2024; originally announced December 2024.

  6. arXiv:2412.04571  [pdf, other]

    cs.AI cs.CY q-bio.NC

    Dissociating Artificial Intelligence from Artificial Consciousness

    Authors: Graham Findlay, William Marshall, Larissa Albantakis, Isaac David, William GP Mayner, Christof Koch, Giulio Tononi

    Abstract: Developments in machine learning and computing power suggest that artificial general intelligence is within reach. This raises the question of artificial consciousness: if a computer were to be functionally equivalent to a human, being able to do all we do, would it experience sights, sounds, and thoughts, as we do when we are conscious? Answering this question in a principled manner can only be d…

    Submitted 3 March, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

  7. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  8. arXiv:2409.13096  [pdf, ps, other]

    cs.CC cs.DS cs.LG

    Fast decision tree learning solves hard coding-theoretic problems

    Authors: Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: We connect the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem ($k$-NCP). Despite significant effort by the respective communities, algorithmic progress on both problems has been stuck: the fastest known algorithm for the former runs in quasipolynomial time (Ehrenfeucht and Haussler 1989) and the best known approximation ratio for the latter is…

    Submitted 25 September, 2024; v1 submitted 19 September, 2024; originally announced September 2024.

    Comments: 31 pages, FOCS 2024

  9. arXiv:2409.11597  [pdf, ps, other]

    cs.CC cs.DS cs.LG stat.ML

    The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem

    Authors: Guy Blanc, Alexandre Hayderi, Caleb Koch, Li-Yang Tan

    Abstract: Smooth boosters generate distributions that do not place too much weight on any given example. Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory. We study and settle the sample complexity of smooth boosting: we exhibit a class that can be weak learned to $\gamma$-advantage over smoo…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: 46 pages, FOCS 2024

  10. arXiv:2407.16867  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    From Text to Insight: Large Language Models for Materials Science Data Extraction

    Authors: Mara Schilling-Wilhelmi, Martiño Ríos-García, Sherjeel Shabih, María Victoria Gil, Santiago Miret, Christoph T. Koch, José A. Márquez, Kevin Maik Jablonka

    Abstract: The vast majority of materials science knowledge exists in unstructured natural language, yet structured data is crucial for innovative and systematic materials design. Traditionally, the field has relied on manual curation and partial automation for data extraction for specific use cases. The advent of large language models (LLMs) represents a significant shift, potentially enabling efficient ext…

    Submitted 2 December, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  11. arXiv:2407.01402  [pdf, ps, other]

    cs.CC cs.DS cs.LG

    Superconstant Inapproximability of Decision Tree Learning

    Authors: Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: We consider the task of properly PAC learning decision trees with queries. Recent work of Koch, Strassle, and Tan showed that the strictest version of this task, where the hypothesis tree $T$ is required to be optimally small, is NP-hard. Their work leaves open the question of whether the task remains intractable if $T$ is only required to be close to optimal, say within a factor of 2, rather than…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 29 pages, 5 figures, COLT 2024

  12. arXiv:2405.16340  [pdf, ps, other]

    cs.CC

    A Strong Direct Sum Theorem for Distributional Query Complexity

    Authors: Guy Blanc, Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: Consider the expected query complexity of computing the $k$-fold direct product $f^{\otimes k}$ of a function $f$ to error $\varepsilon$ with respect to a distribution $\mu^k$. One strategy is to sequentially compute each of the $k$ copies to error $\varepsilon/k$ with respect to $\mu$ and apply the union bound. We prove a strong direct sum theorem showing that this naive strategy is essentially optim…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 34 pages, 4 figures, CCC 2024

  13. Transfer of Reinforcement Learning-Based Controllers from Model- to Hardware-in-the-Loop

    Authors: Mario Picerno, Lucas Koch, Kevin Badalian, Marius Wegener, Joschka Schaub, Charles Robert Koch, Jakob Andert

    Abstract: The process of developing control functions for embedded systems is resource-, time-, and data-intensive, often resulting in sub-optimal costs and solution approaches. Reinforcement Learning (RL) has great potential for autonomously training agents to perform complex control tasks with minimal human intervention. Due to costly data generation and safety constraints, however, its application is mos…

    Submitted 25 October, 2023; originally announced October 2023.

    Journal ref: IEEE Transactions on Vehicular Technology (2025)

  14. Introducing a Deep Neural Network-based Model Predictive Control Framework for Rapid Controller Implementation

    Authors: David C. Gordon, Alexander Winkler, Julian Bedei, Patrick Schaber, Jakob Andert, Charles R. Koch

    Abstract: Model Predictive Control (MPC) provides an optimal control solution based on a cost function while allowing for the implementation of process constraints. As a model-based optimal control technique, the performance of MPC strongly depends on the model used, where a trade-off between model computation time and prediction performance exists. One solution is the integration of MPC with a machine learn…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Submitted to 2024 American Control Conference (ACC), July 8-12, 2024 in Toronto, Canada. ACC is the annual conference of the American Automatic Control Council (AACC), the U.S. national member organization of the International Federation for Automatic Control (IFAC)

  15. arXiv:2307.04093  [pdf, ps, other]

    cs.CC cs.DS cs.LG

    Properly Learning Decision Trees with Queries Is NP-Hard

    Authors: Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a long line of work, dating back to (Pitt-Valiant 1988), establishing the hardness of properly learning decision trees from random examples, the more challenging…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 41 pages, 10 figures, FOCS 2023

  16. arXiv:2307.04039  [pdf, ps, other]

    cs.CC cs.DS

    A Strong Composition Theorem for Junta Complexity and the Boosting of Property Testers

    Authors: Guy Blanc, Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: We prove a strong composition theorem for junta complexity and show how such theorems can be used to generically boost the performance of property testers. The $\varepsilon$-approximate junta complexity of a function $f$ is the smallest integer $r$ such that $f$ is $\varepsilon$-close to a function that depends only on $r$ variables. A strong composition theorem states that if $f$ has large…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: 44 pages, 1 figure, FOCS 2023

  17. arXiv:2302.08789  [pdf, other]

    cs.DB

    Detecting Robustness against MVRC for Transaction Programs with Predicate Reads

    Authors: Brecht Vandevoort, Bas Ketsman, Christoph Koch, Frank Neven

    Abstract: The transactional robustness problem revolves around deciding whether, for a given workload, a lower isolation level than Serializable is sufficient to guarantee serializability. The paper presents a new characterization for robustness against isolation level (multi-version) Read Committed. It supports transaction programs with control structures (loops and conditionals) and inserts, deletes, and…

    Submitted 17 February, 2023; originally announced February 2023.

  18. arXiv:2211.02257  [pdf, ps, other]

    cs.CC cs.DS

    Certification with an NP Oracle

    Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, Li-Yang Tan

    Abstract: In the certification problem, the algorithm is given a function $f$ with certificate complexity $k$ and an input $x^\star$, and the goal is to find a certificate of size $\le \text{poly}(k)$ for $f$'s value at $x^\star$. This problem is in $\mathsf{NP}^{\mathsf{NP}}$, and assuming $\mathsf{P} \ne \mathsf{NP}$, is not in $\mathsf{P}$. Prior works, dating back to Valiant in 1984, have therefore soug…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 25 pages, 2 figures, ITCS 2023

  19. arXiv:2210.06375  [pdf, ps, other]

    cs.CC cs.DS cs.LG

    Superpolynomial Lower Bounds for Decision Tree Learning and Testing

    Authors: Caleb Koch, Carmen Strassle, Li-Yang Tan

    Abstract: We establish new hardness results for decision tree optimization problems, adding to a line of work that dates back to Hyafil and Rivest in 1976. We prove, under randomized ETH, superpolynomial lower bounds for two basic problems: given an explicit representation of a function $f$ and a generator for a distribution $\mathcal{D}$, construct a small decision tree approximator for $f$ under…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: 44 pages, 5 figures, SODA 2023

  20. arXiv:2207.07072  [pdf, ps, other]

    cs.DS cs.LG

    A Query-Optimal Algorithm for Finding Counterfactuals

    Authors: Guy Blanc, Caleb Koch, Jane Lange, Li-Yang Tan

    Abstract: We design an algorithm for finding counterfactuals with strong theoretical guarantees on its performance. For any monotone model $f : X^d \to \{0,1\}$ and instance $x^\star$, our algorithm makes $S(f)^{O(\Delta_f(x^\star))} \cdot \log d$ queries to $f$ and returns an optimal counterfactual for $x^\star$: a nearest instance $x'$ to $x^\star$ for which $f(x')\ne f(x^\star)$. Here $S(f)$ is th…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: 22 pages, ICML 2022

  21. arXiv:2203.15413  [pdf, other]

    physics.comp-ph cs.CV cs.LG

    Deep Reinforcement Learning for Data-Driven Adaptive Scanning in Ptychography

    Authors: Marcel Schloz, Johannes Müller, Thomas C. Pekin, Wouter Van den Broek, Christoph T. Koch

    Abstract: We present a method that lowers the dose required for a ptychographic reconstruction by adaptively scanning the specimen, thereby providing the required spatial information redundancy in the regions of highest importance. The proposed method is built upon a deep learning model that is trained by reinforcement learning (RL), using prior knowledge of the specimen structure from training data sets. W…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 12 pages, 8 figures

  22. arXiv:2203.12466  [pdf, other]

    cs.CR

    Which programming languages do hackers use? A survey at the German Chaos Computer Club

    Authors: Christian Koch, Katharina Müller, Eldar Sultanow

    Abstract: There are numerous articles about the programming languages most commonly used by hackers. Among them, however, there are hardly any scientific studies. One reason might be that hackers mainly operate anonymously and are difficult to reach. This paper aims to shed light on this interesting and relevant research question. In order to find answers, we conducted a survey among the members of the Germ…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 14 pages, 11 tables

  23. A Matheuristic Approach for Solving a Simultaneous Lot Sizing and Scheduling Problem with Client Prioritization in Tire Industry

    Authors: Cyril Koch, Taha Arbaoui, Yassine Ouazene, Farouk Yalaoui, Humbert De Brunier, Nicolas Jaunet, Antoine De Wulf

    Abstract: This paper introduces an integrated lot sizing and scheduling problem inspired by a real-world application in the off-the-road tire industry. This problem considers the assignment of different items on parallel machines with complex eligibility constraints within a finite planning horizon. It also considers a large panel of specific constraints such as: backordering, a limited number of setups, upst…

    Submitted 21 January, 2022; originally announced January 2022.

  24. arXiv:2201.07736  [pdf, ps, other]

    cs.DS cs.CC

    The Query Complexity of Certification

    Authors: Guy Blanc, Caleb Koch, Jane Lange, Li-Yang Tan

    Abstract: We study the problem of certification: given queries to a function $f : \{0,1\}^n \to \{0,1\}$ with certificate complexity $\le k$ and an input $x^\star$, output a size-$k$ certificate for $f$'s value on $x^\star$. This abstractly models a central problem in explainable machine learning, where we think of $f$ as a blackbox model that we seek to explain the predictions of. For monotone func…

    Submitted 6 April, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: 30 pages, to appear in STOC'22. Edit: fixed typos and added references

  25. Robustness against Read Committed for Transaction Templates with Functional Constraints

    Authors: Brecht Vandevoort, Bas Ketsman, Christoph Koch, Frank Neven

    Abstract: The popular isolation level Multiversion Read Committed (RC) trades some of the strong guarantees of serializability for increased transaction throughput. Sometimes, transaction workloads can be safely executed under RC obtaining serializability at the lower cost of RC. Such workloads are said to be robust against RC. Previous work has yielded a tractable procedure for deciding robustness against…

    Submitted 22 December, 2023; v1 submitted 13 January, 2022; originally announced January 2022.

    Journal ref: Logical Methods in Computer Science, Volume 19, Issue 4 (December 25, 2023) lmcs:10173

  26. arXiv:2110.06640  [pdf, other]

    cs.CV cs.LG

    Detecting Slag Formations with Deep Convolutional Neural Networks

    Authors: Christian von Koch, William Anzén, Max Fischer, Raazesh Sainudiin

    Abstract: We investigate the ability to detect slag formations in images from inside a Grate-Kiln system furnace with two deep convolutional neural networks. The conditions inside the furnace cause occasional obstructions of the camera view. Our approach suggests dealing with this problem by introducing a convLSTM-layer in the deep convolutional neural network. The results show that it is possible to achiev…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: 15 pages, 6 figures, to be published in the proceedings of DAGM German Conference on Pattern Recognition 2021

  27. arXiv:2107.12239  [pdf, other]

    cs.DB

    Robustness against Read Committed for Transaction Templates

    Authors: Brecht Vandevoort, Bas Ketsman, Christoph Koch, Frank Neven

    Abstract: The isolation level Multiversion Read Committed (RC), offered by many database systems, is known to trade consistency for increased transaction throughput. Sometimes, transaction workloads can be safely executed under RC obtaining the perfect isolation of serializability at the lower cost of RC. To identify such cases, we introduce an expressive model of transaction programs to better reason about…

    Submitted 26 July, 2021; originally announced July 2021.

  28. arXiv:2104.05661  [pdf, other]

    cs.LG cs.AI cs.RO

    Extraction and Analysis of Highway On-Ramp Merging Scenarios from Naturalistic Trajectory Data

    Authors: Lars Klitzke, Kay Gimm, Carsten Koch, Frank Köster

    Abstract: Connected and Automated Vehicles (CAVs) are envisioned to transform the future industrial and private transportation sectors. However, due to the system's enormous complexity, functional verification and validation of safety aspects are essential before the technology merges into the public domain. Therefore, in recent years, a scenario-driven approach has gained acceptance, emphasizing the requir…

    Submitted 3 March, 2022; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: 7 pages

    ACM Class: I.2.8; I.5.4; I.6.5

  29. arXiv:1910.02397  [pdf, other]

    cs.NI

    Increasing the Quality of 360° Video Streaming by Transitioning between Viewport Quality Adaptation Mechanisms

    Authors: Christian Koch, Arne-Tobias Rak, Michael Zink, Ralf Steinmetz, Amr Rizk

    Abstract: Virtual reality has been gaining popularity in recent years, driven by the proliferation of affordable consumer-grade devices such as Oculus Rift, HTC Vive, and Samsung VR. Amongst the various VR applications, 360° video streaming is currently one of the most popular ones. It allows users to change their field-of-view (FoV) based on head movement, which enables them to freely select an area anywhere…

    Submitted 6 October, 2019; originally announced October 2019.

    Comments: Our code: https://github.com/arizk/360transitions

  30. arXiv:1808.01344  [pdf, other]

    cs.PL

    A Compiler-Compiler for DSL Embedding

    Authors: Amir Shaikhha, Vojin Jovanovic, Christoph Koch

    Abstract: In this paper, we present a framework to generate compilers for embedded domain-specific languages (EDSLs). This framework provides facilities to automatically generate the boilerplate code required for building DSL compilers on top of extensible optimizing compilers. We evaluate the practicality of our framework by demonstrating several use-cases successfully built with it.

    Submitted 3 August, 2018; originally announced August 2018.

  31. arXiv:1807.09887  [pdf, other]

    cs.DB

    Compiling Database Application Programs

    Authors: Mohammad Dashti, Sachin Basil John, Thierry Coppey, Amir Shaikhha, Vojin Jovanovic, Christoph Koch

    Abstract: There is a trend towards increased specialization of data management software for performance reasons. In this paper, we study the automatic specialization and optimization of database application programs -- sequences of queries and updates, augmented with control flow constructs as they appear in database scripts, UDFs, transactional workloads and triggers in languages such as PL/SQL. We show ho…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: 16 pages

    ACM Class: H.2.4

  32. arXiv:1807.02020  [pdf, other]

    cs.CV

    Detection and Analysis of Content Creator Collaborations in YouTube Videos using Face- and Speaker-Recognition

    Authors: Moritz Lode, Michael Örtl, Christian Koch, Amr Rizk, Ralf Steinmetz

    Abstract: This work discusses and implements the application of speaker recognition for the detection of collaborations in YouTube videos. CATANA, an existing framework for detection and analysis of YouTube collaborations, uses face recognition for the detection of collaborators, which naturally performs poorly on video content in which no faces appear. This work proposes an extension of CATANA using a…

    Submitted 5 July, 2018; originally announced July 2018.

  33. arXiv:1806.02136  [pdf, other]

    cs.MS cs.LG cs.PL cs.SC stat.ML

    Efficient Differentiable Programming in a Functional Array-Processing Language

    Authors: Amir Shaikhha, Andrew Fitzgibbon, Dimitrios Vytiniotis, Simon Peyton Jones, Christoph Koch

    Abstract: We present a system for the automatic differentiation of a higher-order functional array-processing language. The core functional language underlying this system simultaneously supports both source-to-source automatic differentiation and global optimizations such as loop transformations. Thanks to this feature, we demonstrate how for some real-world machine learning and computer vision benchmarks,…

    Submitted 6 June, 2018; originally announced June 2018.

  34. arXiv:1805.01887  [pdf, other]

    cs.CV cs.SI

    Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity

    Authors: Christian Koch, Moritz Lode, Denny Stohr, Amr Rizk, Ralf Steinmetz

    Abstract: YouTube is one of the most popular platforms for streaming of user-generated video. Nowadays, professional YouTubers are organized in so-called multi-channel networks (MCNs). These networks offer services such as brand deals, equipment, and strategic advice in exchange for a share of the YouTubers' revenue. A major strategy to gain more subscribers and, hence, revenue is collaborating with other Y…

    Submitted 1 May, 2018; originally announced May 2018.

    Comments: 28 pages, 21 figures

  35. arXiv:1709.02823  [pdf, other]

    cs.SE

    Java Extensions for OMNeT++

    Authors: Henning Puttnies, Peter Danielis, Christian Koch, Dirk Timmermann

    Abstract: On the one hand, network simulation frameworks are important tools for research and development activities to evaluate novel approaches in a time- and cost-efficient way. On the other hand, Java as a highly platform-independent programming language is ideally suited for rapid prototyping in heterogeneous scenarios. Consequently, Java simulation frameworks could be used to first perform functiona…

    Submitted 8 September, 2017; originally announced September 2017.

    Comments: Published in: A. Foerster, A. Udugama, A. Koensgen, A. Virdis, M. Kirsche (Eds.), Proc. of the 4th OMNeT++ Community Summit, University of Bremen - Germany - September 7-8, 2017

    Report number: OMNET/2017/10

    ACM Class: I.6.5; I.6.7

  36. arXiv:1707.09422  [pdf, other]

    cs.NI cs.AI

    Hyperprofile-based Computation Offloading for Mobile Edge Networks

    Authors: Andrew Crutcher, Caleb Koch, Kyle Coleman, Jon Patman, Flavio Esposito, Prasad Calyam

    Abstract: In recent studies, researchers have developed various computation offloading frameworks for bringing cloud services closer to the user via edge networks. Specifically, an edge device needs to offload computationally intensive tasks because of energy and processing constraints. These constraints present the challenge of identifying which edge nodes should receive tasks to reduce overall resource co…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: 5 pages, NSF REU Site publication

  37. arXiv:1612.05566  [pdf, other]

    cs.DB

    Building Efficient Query Engines in a High-Level Language

    Authors: Amir Shaikhha, Yannis Klonatos, Christoph Koch

    Abstract: Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to…

    Submitted 16 December, 2016; originally announced December 2016.

  38. arXiv:1610.09166  [pdf, other]

    cs.DB cs.PL

    Push vs. Pull-Based Loop Fusion in Query Engines

    Authors: Amir Shaikhha, Mohammad Dashti, Christoph Koch

    Abstract: Database query engines use pull-based or push-based approaches to avoid the materialization of data across query operators. In this paper, we study these two types of query engines in depth and present the limitations and advantages of each engine. Similarly, the programming languages community has developed loop fusion techniques to remove intermediate collections in the context of collection pro…

    Submitted 28 October, 2016; originally announced October 2016.

  39. arXiv:1603.02057  [pdf, other]

    math.PR cs.SI math.CO

    Bootstrap percolation on geometric inhomogeneous random graphs

    Authors: Christoph Koch, Johannes Lengler

    Abstract: Geometric inhomogeneous random graphs (GIRGs) are a model for scale-free networks with underlying geometry. We study bootstrap percolation on these graphs, which is a process modelling the spread of an infection of vertices starting within a (small) local region. We show that the process exhibits a phase transition in terms of the initial infection rate in this region. We determine the speed of th…

    Submitted 1 September, 2020; v1 submitted 18 February, 2016; originally announced March 2016.

    Comments: 36 pages, 1 figure

    MSC Class: 05C80; 05C82; 60C05; 60K35; 91D25

    ACM Class: C.2.1; G.2.2

  40. arXiv:1603.00542  [pdf, other]

    cs.DB

    Repairing Conflicts among MVCC Transactions

    Authors: Mohammad Dashti, Sachin Basil John, Amir Shaikhha, Christoph Koch

    Abstract: The optimistic variants of MVCC (Multi-Version Concurrency Control) avoid blocking concurrent transactions at the cost of having a validation phase. Upon failure in the validation phase, the transaction is usually aborted and restarted from scratch. The "abort and restart" approach becomes a performance bottleneck for use cases with high-contention objects or long-running transactions. In addi…

    Submitted 1 March, 2016; originally announced March 2016.

    Comments: 12 pages, 9 figures

    ACM Class: H.2.4

  41. arXiv:1603.00400  [pdf, ps, other]

    cs.DB

    A Fast Randomized Algorithm for Multi-Objective Query Optimization

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: Query plans are compared according to multiple cost metrics in multi-objective query optimization. The goal is to find the set of Pareto plans realizing optimal cost tradeoffs for a given query. So far, only algorithms with exponential complexity in the number of query tables have been proposed for multi-objective query optimization. In this work, we present the first algorithm with polynomial com…

    Submitted 1 March, 2016; originally announced March 2016.

  42. arXiv:1511.02071  [pdf, ps, other]

    cs.DB

    Solving the Join Ordering Problem via Mixed Integer Linear Programming

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: We transform join ordering into a mixed integer linear program (MILP). This allows query optimization to be addressed by mature MILP solver implementations that have evolved over decades and steadily improved their performance. They offer features such as anytime optimization and parallel search that are highly relevant for query optimization. We present a MILP formulation for searching left-deep que…

    Submitted 6 November, 2015; originally announced November 2015.

  43. arXiv:1511.01782  [pdf, ps, other]

    cs.DB

    Probably Approximately Optimal Query Optimization

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: Evaluating query predicates on data samples is the only way to estimate their selectivity in certain scenarios. Finding a guaranteed optimal query plan is not a reasonable optimization goal in those cases as it might require an infinite number of samples. We therefore introduce probably approximately optimal query optimization (PAO) where the goal is to find a query plan whose cost is near-optimal…

    Submitted 5 November, 2015; originally announced November 2015.

  44. arXiv:1511.01768  [pdf, ps, other]

    cs.DB

    Parallelizing Query Optimization on Shared-Nothing Architectures

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query evaluation. We show how to parallelize query optimization at a massive scale. We present algorithms for parallel query optimization in left-deep and bushy pla…

    Submitted 5 November, 2015; originally announced November 2015.

  45. arXiv:1510.06437  [pdf, other]

    cs.DB quant-ph

    Multiple Query Optimization on the D-Wave 2X Adiabatic Quantum Computer

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: The D-Wave adiabatic quantum annealer solves hard combinatorial optimization problems leveraging quantum physics. The newest version features over 1000 qubits and was released in August 2015. We were given access to such a machine, currently hosted at NASA Ames Research Center in California, to explore the potential for hard optimization problems that arise in the context of databases. In this p…

    Submitted 21 October, 2015; originally announced October 2015.

  46. arXiv:1412.4320  [pdf, ps, other]

    cs.DB

    Incremental View Maintenance For Collection Programming

    Authors: Christoph Koch, Daniel Lupei, Val Tannen

    Abstract: In the context of incremental view maintenance (IVM), delta query derivation is an essential technique for speeding up the processing of large, dynamic datasets. The goal is to generate delta queries that, given a small change in the input, can update the materialized view more efficiently than via recomputation. In this work we propose the first solution for the efficient incrementalization of po…

    Submitted 11 April, 2016; v1 submitted 14 December, 2014; originally announced December 2014.

    Comments: 24 pages (12 pages plus appendix)

  47. arXiv:1406.2807  [pdf, other]

    cs.CV

    The Secrets of Salient Object Segmentation

    Authors: Yin Li, Xiaodi Hou, Christof Koch, James M. Rehg, Alan L. Yuille

    Abstract: In this paper we provide an extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets. Our analysis identifies serious design flaws of existing salient object benchmarks, called the dataset design bias, by over emphasizing the stereotypical concepts of saliency. The dataset design bias does not only create the discomforting disco…

    Submitted 12 June, 2014; v1 submitted 11 June, 2014; originally announced June 2014.

    Comments: 15 pages, 8 figures. Conference version was accepted by CVPR 2014

    Report number: CBMM Memo #14

  48. arXiv:1404.0046  [pdf, other]

    cs.DB

    Approximation Schemes for Many-Objective Query Optimization

    Authors: Immanuel Trummer, Christoph Koch

    Abstract: The goal of multi-objective query optimization (MOQO) is to find query plans that realize a good compromise between conflicting objectives such as minimizing execution time and minimizing monetary fees in a Cloud scenario. A previously proposed exhaustive MOQO algorithm needs hours to optimize even simple TPC-H queries. This is why we propose several approximation schemes for MOQO that generate gu…

    Submitted 31 March, 2014; originally announced April 2014.

  49. LINVIEW: Incremental View Maintenance for Complex Analytical Queries

    Authors: Milos Nikolic, Mohammed ElSeidy, Christoph Koch

    Abstract: Many analytics tasks and machine learning problems can be naturally expressed by iterative linear algebra programs. In this paper, we study the incremental view maintenance problem for such complex analytical queries. We develop a framework, called LINVIEW, for capturing deltas of linear algebra programs and understanding their computational cost. Linear algebra operations tend to cause an avalanc…

    Submitted 9 May, 2014; v1 submitted 27 March, 2014; originally announced March 2014.

    Comments: 14 pages, SIGMOD

    ACM Class: H.2.4; G.1.3

  50. arXiv:1403.2307  [pdf, other]

    cs.DB

    The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis

    Authors: Sudip Roy, Lucja Kot, Gabriel Bender, Bailu Ding, Hossein Hojjat, Christoph Koch, Nate Foster, Johannes Gehrke

    Abstract: Datastores today rely on distribution and replication to achieve improved performance and fault-tolerance. But correctness of many applications depends on strong consistency properties - something that can impose substantial overheads, since it requires coordinating the behavior of multiple nodes. This paper describes a new approach to achieving strong consistency in distributed systems while mini…

    Submitted 19 January, 2015; v1 submitted 10 March, 2014; originally announced March 2014.