-
Fingerprinting New York City's Scaffolding Problem with Longitudinal Dashcam Data
Authors:
Dorin Shapira,
Matt Franchi,
Wendy Ju
Abstract:
Scaffolds, also called sidewalk sheds, are intended to be temporary structures to protect pedestrians from construction and repair hazards. However, some sidewalk sheds are left up for years. Long-term scaffolding becomes an eyesore, creates accessibility issues on sidewalks, and gives cover to illicit activity. Today, there are over 8,000 active permits for scaffolds in NYC; the more problematic scaffolds are likely expired or unpermitted. This research uses computer vision on street-level imagery to develop a longitudinal map of scaffolding throughout the city. Using a dataset of 29,156,833 dashcam images taken between August 2023 and January 2024, we develop an algorithm to track the presence of scaffolding over time. We also design and implement methods to match detected scaffolds to reported locations of active scaffolding permits, enabling the identification of sidewalk sheds without corresponding permits. We identify 850,766 images of scaffolding, tagging 5,156 active sidewalk sheds and estimating 529 unpermitted sheds. We discuss the implications of an in-the-wild scaffolding classifier for urban tech, innovations to governmental inspection processes, and out-of-distribution evaluations outside of New York City.
Submitted 9 February, 2024;
originally announced February 2024.
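The permit-matching step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' method: it assumes detections and permits are plain (latitude, longitude) pairs, and the function names and the 50-metre threshold `max_dist_m` are hypothetical choices made only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def unmatched_detections(detections, permits, max_dist_m=50.0):
    """Return detections with no permit location within max_dist_m (hypothetical threshold)."""
    return [
        d for d in detections
        if all(haversine_m(d[0], d[1], p[0], p[1]) > max_dist_m for p in permits)
    ]


# Hypothetical coordinates: the first detection lies about 11 m from the permit.
detections = [(40.7580, -73.9855), (40.6892, -74.0445)]
permits = [(40.7581, -73.9855)]
candidates = unmatched_detections(detections, permits)
```

A brute-force scan like this is quadratic in the number of points; at city scale a spatial index would likely be preferable.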
-
Asymmetric Number Partitioning with Splitting and Interval Targets
Authors:
Samuel Bismuth,
Erel Segal-Halevi,
Dana Shapira
Abstract:
The n-way number partitioning problem, a fundamental challenge in combinatorial optimization, has significant implications for applications such as fair division and machine scheduling. Despite these problems being NP-hard, many approximation techniques exist. We consider three closely related kinds of approximations, and various objectives such as decision, min-max, max-min, and even a generalized objective, in which the bins are not considered identical anymore, but rather asymmetric (used to solve fair division to asymmetric agents or uniform machine scheduling problems).
The first two variants optimize the partition as follows: in the first variant, some fixed number s of items can be split between two or more bins, and in the second variant, at most a fixed number t of splittings is allowed. The third variant is a decision problem: the largest bin sum must be within a pre-specified interval, parameterized by a fixed rational number u times the largest item size.
When the number of bins n is unbounded, we show that every variant is strongly NP-complete. When the number of bins n is fixed, the running time depends on the fixed parameters s,t,u. For each variant, we give a complete picture of its running time.
For n=2, the running time is easy to identify. Our main results consider any fixed n>=3. Using a two-way polynomial-time reduction between the first and the third variant, we show that n-way number-partitioning with s split items can be solved in polynomial time if s>=n-2, and it is NP-complete otherwise. Also, n-way number-partitioning with t splittings can be solved in polynomial time if t>=n-1, and it is NP-complete otherwise. Finally, we show that the third variant can be solved in polynomial time if u>=(n-2)/n, and it is NP-complete otherwise. Our positive results for the optimization problems consider both asymmetric min-max and asymmetric max-min versions.
Submitted 3 April, 2025; v1 submitted 25 April, 2022;
originally announced April 2022.
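For intuition on why n-1 splittings make the problem easy in the identical-bins case, the following sketch lays the items end to end and cuts the line at multiples of the average bin sum. It is a standard "cut-the-line" illustration rather than the paper's algorithm, it ignores the asymmetric objectives, and the names `cut_the_line` and `count_splittings` are hypothetical. Item sizes are assumed to be positive integers.

```python
from fractions import Fraction


def cut_the_line(items, n):
    """Split positive item sizes into n bins of equal sum using at most n-1 splittings.

    Returns a list of bins; each bin is a list of (item_index, piece_size) pairs.
    """
    target = Fraction(sum(items), n)  # exact average bin sum
    bins = [[] for _ in range(n)]
    b, room = 0, target
    for idx, size in enumerate(items):
        remaining = Fraction(size)
        while remaining > 0:
            if room == 0:  # current bin is full, move on to the next one
                b += 1
                room = target
            piece = min(remaining, room)
            bins[b].append((idx, piece))
            remaining -= piece
            room -= piece
    return bins


def count_splittings(bins, num_items):
    """Each extra piece beyond one per item corresponds to one splitting."""
    return sum(len(b) for b in bins) - num_items


bins = cut_the_line([7, 5, 3, 1], 3)
```

In this example every bin sums to 16/3 and two items are cut once each, matching the n-1 bound for n=3.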
-
Weighted Burrows-Wheeler Compression
Authors:
Aharon Fruchtman,
Yoav Gross,
Shmuel T. Klein,
Dana Shapira
Abstract:
A weight-based dynamic compression method has recently been proposed, which is especially suitable for the encoding of files with locally skewed distributions. Its main idea is to assign larger weights to symbols closer to the one being encoded by means of an increasing weight function, rather than weighting every position in the text equally. A well-known transformation that tends to convert input files into files with a more skewed distribution is the Burrows-Wheeler Transform. This paper employs the weighted approach on Burrows-Wheeler transformed files and provides empirical evidence of the efficiency of this combination.
Submitted 21 May, 2021;
originally announced May 2021.
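The two ingredients named in the abstract above can be sketched as follows. The transform is the textbook sorted-rotations construction, which is quadratic and meant only for illustration. The weighted counts are an assumption-laden reading of "larger weights to closer symbols": the function `weighted_counts` and the weight function `2**j` are hypothetical and are not taken from the paper.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform: last column of the sorted rotations.

    Assumes the sentinel does not occur in text and sorts before every other symbol.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def weighted_counts(text, i, g):
    """Weighted frequency of each symbol seen before position i.

    g is an increasing weight function of the position, so recent
    occurrences contribute more than distant ones.
    """
    counts = {}
    for j in range(i):
        counts[text[j]] = counts.get(text[j], 0) + g(j)
    return counts


transformed = bwt("banana")
# Hypothetical exponential weight function: position j gets weight 2**j.
counts = weighted_counts("aab", 3, lambda j: 2 ** j)
```

In the second call, the single recent `b` outweighs the two earlier `a` symbols, which is the kind of local adaptation the weighted approach aims for.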
-
Weighted Adaptive Coding
Authors:
Aharon Fruchtman,
Yoav Gross,
Shmuel T. Klein,
Dana Shapira
Abstract:
Huffman coding is known to be optimal, yet its dynamic version may be even more efficient in practice. A new variant of Huffman encoding has recently been proposed that provably always performs better than static Huffman coding by at least $m-1$ bits, where $m$ denotes the size of the alphabet, and has a better worst case than the standard dynamic Huffman coding. This paper introduces a new generic coding method, extending the known static and dynamic variants and including them as special cases. In fact, the generalization is applicable to all statistical methods, including arithmetic coding. This then leads to the formalization of a new adaptive coding method, which is provably always at least as good as the best dynamic variant known to date. Moreover, we present empirical results that show improvements over static and dynamic Huffman and arithmetic coding achieved by the proposed method, even when the encoded file includes the model description.
Submitted 17 May, 2020;
originally announced May 2020.
-
Sustainable Online Communities Exhibit Distinct Hierarchical Structures Across Scales of Size
Authors:
Yaniv Dover,
Jacob Goldenberg,
Daniel Shapira
Abstract:
Online communities exist in many forms and sizes, and are a source of considerable influence for individuals and organizations. Yet, there is limited insight into why some online communities are sustainable, while others cease to exist. We find that communities that fail to maintain a typical hierarchical social structure which balances cohesiveness across size scales do not survive, and can be distinguished from communities that exhibit such balance and prevail in the long term. Moreover, in an analysis of 10,122 real-life online communities with a total of 134,747 members over a period of more than a decade, we find that mapping the community social circle structure in the first 30 days of its lifetime is sufficient to forecast the survival of the community up to ten years in the future. By varying calibration time frames, we find that the aspects of the social structure that allow for predictive power emerge and fixate within the first couple of months of a community's lifetime.
Submitted 20 March, 2018;
originally announced March 2018.
-
Scheduling with regular performance measures and optional job rejection on a single machine
Authors:
Baruch Mor,
Dana Shapira
Abstract:
We address single-machine problems with optional job rejection, studied recently in Zhang et al. [21] and Cao et al. [2]. In these papers, the authors focus on minimizing regular performance measures, i.e., functions that are non-decreasing in the job completion times, subject to the constraint that the total rejection cost cannot exceed a predefined upper bound. They also prove that the considered problems are ordinary NP-hard and provide pseudo-polynomial-time Dynamic Programming (DP) solutions. In this paper, we focus on three of these problems: makespan with release dates, total completion time, and total weighted completion time, and present enhanced DP solutions demonstrating both theoretical and practical improvements. Moreover, we provide extensive numerical studies verifying their efficiency.
Submitted 1 February, 2018; v1 submitted 10 November, 2017;
originally announced November 2017.
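To make the problem setting in the abstract above concrete, here is a minimal pseudo-polynomial DP sketch for one of the three problems: minimizing total completion time subject to an upper bound on total rejection cost. It is a straightforward textbook-style formulation, not the enhanced DP of the paper or the DPs of the cited works. The function name `min_total_completion` is hypothetical, and rejection costs and the bound are assumed to be non-negative integers.

```python
def min_total_completion(jobs, budget):
    """Minimum total completion time over accepted jobs.

    jobs: list of (processing_time, rejection_cost) pairs.
    budget: upper bound on the total rejection cost.
    Accepted jobs run in shortest-processing-time order, so a job with k
    accepted jobs scheduled after it contributes (k + 1) * processing_time.
    """
    INF = float("inf")
    n = len(jobs)
    # f[k][r]: best value with k accepted jobs and total rejection cost r
    f = [[INF] * (budget + 1) for _ in range(n + 1)]
    f[0][0] = 0
    # Longest jobs first: jobs decided earlier sit later in the schedule.
    for p, e in sorted(jobs, key=lambda job: job[0], reverse=True):
        new_f = [[INF] * (budget + 1) for _ in range(n + 1)]
        for k in range(n):
            for r in range(budget + 1):
                cur = f[k][r]
                if cur == INF:
                    continue
                # Accept: this job precedes the k accepted (longer) jobs.
                new_f[k + 1][r] = min(new_f[k + 1][r], cur + (k + 1) * p)
                # Reject, if the rejection budget allows it.
                if r + e <= budget:
                    new_f[k][r + e] = min(new_f[k][r + e], cur)
        f = new_f
    return min(min(row) for row in f)


example_jobs = [(3, 2), (1, 5), (2, 1)]  # hypothetical instance
best = min_total_completion(example_jobs, 2)
```

The table has O(n * budget) states per job, giving O(n^2 * budget) time overall, which is pseudo-polynomial in the rejection bound.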