Showing 1–24 of 24 results for author: Horton, N J

Searching in archive stat.
  1. arXiv:2506.03232  [pdf, ps, other]

    stat.OT cs.CY

    Pivoting the paradigm: the role of spreadsheets in K-12 data science

    Authors: Oren Tirschwell, Nicholas Jon Horton

    Abstract: Spreadsheet tools are widely accessible to and commonly used by K-12 students and teachers. They have an important role in data collection and organization. Beyond data organization, spreadsheets also make data visible and easy to interact with, facilitating student engagement in data exploration and analysis. Though not suitable for all circumstances, spreadsheets can and do help foster data and…

    Submitted 3 June, 2025; originally announced June 2025.

  2. arXiv:2406.07756  [pdf, ps, other]

    stat.ME

    The Exchangeability Assumption for Permutation Tests of Multiple Regression Models: Implications for Statistics and Data Science Educators

    Authors: Johanna Hardin, Lauren Quesada, Julie Ye, Nicholas J. Horton

    Abstract: Permutation tests are a powerful and flexible approach to inference via resampling. As computational methods become more ubiquitous in the statistics curriculum, use of permutation tests has become more tractable. At the heart of the permutation approach is the exchangeability assumption, which determines the appropriate null sampling distribution. We explore the exchangeability assumption in the…

    Submitted 5 June, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

  3. Guidelines and Best Practices to Share Deidentified Data and Code

    Authors: Nicholas J. Horton, Sara Stoudt

    Abstract: In 2022, the Journal of Statistics and Data Science Education (JSDSE) instituted augmented requirements for authors to post deidentified data and code underlying their papers. These changes were prompted by an increased focus on reproducibility and open science (NASEM 2019). A recent review of data availability practices noted that "such policies help increase the reproducibility of the published…

    Submitted 8 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  4. Data science transfer pathways from associate's to bachelor's programs

    Authors: Benjamin S. Baumer, Nicholas J. Horton

    Abstract: A substantial fraction of students who complete their college education at a public university in the United States begin their journey at one of the 935 public two-year colleges. While the number of four-year colleges offering bachelor's degrees in data science continues to increase, data science instruction at many two-year colleges lags behind. A major impediment is the relative paucity of intr…

    Submitted 6 January, 2023; v1 submitted 22 October, 2022; originally announced October 2022.

    MSC Class: 97B40; 62-07 ACM Class: K.3.2

  5. Fostering better coding practices for data scientists

    Authors: Randall Pruim, Maria-Cristiana Gîrjău, Nicholas J. Horton

    Abstract: Many data science students and practitioners don't see the value in making time to learn and adopt good coding practices as long as the code "works". However, code standards are an important part of modern data science practice, and they play an essential role in the development of data acumen. Good coding practices lead to more reliable code and save more time than they cost, making them importan…

    Submitted 25 August, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

    ACM Class: K.7.m

  6. Spam four ways: Making sense of text data

    Authors: Nicholas J. Horton, Jie Chao, William Finzer, Phebe Palmer

    Abstract: The world is full of text data, yet text analytics has not traditionally played a large part in statistics education. We consider four different ways to provide students with opportunities to explore whether email messages are unwanted correspondence (spam). Text from subject lines are used to identify features that can be used in classification. The approaches include use of a Model Eliciting Act…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: in press, CHANCE

  7. An educator's perspective of the tidyverse

    Authors: Mine Çetinkaya-Rundel, Johanna Hardin, Benjamin S. Baumer, Amelia McNamara, Nicholas J. Horton, Colin Rundel

    Abstract: Computing makes up a large and growing component of data science and statistics courses. Many of those courses, especially when taught by faculty who are statisticians by training, teach R as the programming language. A number of instructors have opted to build much of their teaching around use of the tidyverse. The tidyverse, in the words of its developers, "is a collection of R packages that sha…

    Submitted 22 April, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

  8. arXiv:2106.11209  [pdf, other]

    stat.OT stat.CO

    Facilitating team-based data science: lessons learned from the DSC-WAV project

    Authors: Chelsey Legacy, Andrew Zieffler, Benjamin S. Baumer, Valerie Barr, Nicholas J. Horton

    Abstract: While coursework provides undergraduate data science students with some relevant analytic skills, many are not given the rich experiences with data and computing they need to be successful in the workplace. Additionally, students often have limited exposure to team-based data science and the principles and tools of collaboration that are encountered outside of school. In this paper, we describe th…

    Submitted 21 October, 2021; v1 submitted 21 June, 2021; originally announced June 2021.

    MSC Class: 97K80; 97P99

  9. Integrating computing in the statistics and data science curriculum: Creative structures, novel skills and habits, and ways to teach computational thinking

    Authors: Nicholas J. Horton, Johanna S. Hardin

    Abstract: Nolan and Temple Lang (2010) argued for the fundamental role of computing in the statistics curriculum. In the intervening decade the statistics education community has acknowledged that computational skills are as important to statistics and data science practice as mathematics. There remains a notable gap, however, between our intentions and our actions. In this special issue of the *Journal of…

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: In press, Journal of Statistics and Data Science Education

  10. Implementing version control with Git and GitHub as a learning objective in statistics and data science courses

    Authors: Matthew D. Beckman, Mine Çetinkaya-Rundel, Nicholas J. Horton, Colin W. Rundel, Adam J. Sullivan, Maria Tackett

    Abstract: A version control system records changes to a file or set of files over time so that changes can be tracked and specific versions of a file can be recalled later. As such, it is an essential element of a reproducible workflow that deserves due consideration among the learning objectives of statistics and data science courses. This paper describes experiences and implementation decisions of four co…

    Submitted 4 November, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

    Comments: In press, Journal of Statistics and Data Science Education

  11. Data scraping, ingestation, and modeling: bringing data from cars.com into the intro stats class

    Authors: Sarah McDonald, Nicholas Jon Horton

    Abstract: New tools have made it much easier for students to develop skills to work with interesting data sets as they begin to extract meaning from data. To fully appreciate the statistical analysis cycle, students benefit from repeated experiences collecting, ingesting, wrangling, analyzing data and communicating results. How can we bring such opportunities into the classroom? We describe a classroom acti…

    Submitted 9 September, 2018; originally announced September 2018.

    Comments: in press, CHANCE

  12. Greater data science at baccalaureate institutions

    Authors: Amelia McNamara, Nicholas J. Horton, Benjamin S. Baumer

    Abstract: Donoho's JCGS (in press) paper is a spirited call to action for statisticians, who he points out are losing ground in the field of data science by refusing to accept that data science is its own domain. (Or, at least, a domain that is becoming distinctly defined.) He calls on writings by John Tukey, Bill Cleveland, and Leo Breiman, among others, to remind us that statisticians have been dealing wi…

    Submitted 24 October, 2017; originally announced October 2017.

    Comments: in press response to Donoho paper in Journal of Computational Graphics and Statistics

  13. Updated guidelines, updated curriculum: The GAISE College Report and introductory statistics for the modern student

    Authors: Beverly L. Wood, Megan Mocko, Michelle Everson, Nicholas J. Horton, Paul Velleman

    Abstract: Since the 2005 American Statistical Association's (ASA) endorsement of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) College Report, changes in the statistics field and statistics education have had a major impact on the teaching and learning of statistics. We now live in a world where "Statistics - the science of learning from data - is the fastest-growing science,…

    Submitted 26 May, 2017; originally announced May 2017.

    Comments: in press, CHANCE

    MSC Class: 62-01

  14. arXiv:1705.08544  [pdf, other]

    stat.OT

    Data Visualization on Day One: Bringing Big Ideas into Intro Stats Early and Often

    Authors: Xiaofei Wang, Cynthia Rush, Nicholas Jon Horton

    Abstract: In a world awash with data, the ability to think and compute with data has become an important skill for students in many fields. For that reason, inclusion of some level of statistical computing in many introductory-level courses has grown more common in recent years. Existing literature has documented multiple success stories of teaching statistics with R, bolstered by the capabilities of R Mark…

    Submitted 23 May, 2017; originally announced May 2017.

    Comments: Accepted in Technology Innovations in Statistics Education

    Journal ref: Technology Innovations in Statistics Education, 10(1) (2017). https://escholarship.org/uc/item/84v3774z

  15. A mean score method for sensitivity analysis to departures from the missing at random assumption in randomised trials

    Authors: Ian R. White, James Carpenter, Nicholas J. Horton

    Abstract: Most analyses of randomised trials with incomplete outcomes make untestable assumptions and should therefore be subjected to sensitivity analyses. However, methods for sensitivity analyses are not widely used. We propose a mean score approach for exploring global sensitivity to departures from missing at random or other assumptions about incomplete outcome data in a randomised trial. We assume a s…

    Submitted 2 May, 2017; originally announced May 2017.

    Comments: pre-publication (author version) in press, Statistica Sinica

    MSC Class: 62

  16. Enriching students' conceptual understanding of confidence intervals: An interactive trivia-based classroom activity

    Authors: Xiaofei Wang, Nicholas G. Reich, Nicholas J. Horton

    Abstract: Confidence intervals provide a way to determine plausible values for a population parameter. They are omnipresent in research articles involving statistical analyses. Appropriately, a key statistical literacy learning objective is the ability to interpret and understand confidence intervals in a wide range of settings. As instructors, we devote a considerable amount of time and effort to ensure th…

    Submitted 29 January, 2017; originally announced January 2017.

    Comments: 18 pages; in press at the American Statistician

    MSC Class: 62-01

  17. Using a "Study of Studies" to help statistics students assess research findings

    Authors: Azka Javaid, Xiaofei Wang, Nicholas J Horton

    Abstract: One learning goal of the introductory statistics course is to develop the ability to make sense of research findings in published papers. The Atlantic magazine regularly publishes a feature called "Study of Studies" that summarizes multiple articles published in a particular domain. We describe a classroom activity to develop this capacity using the "Study of Studies." In this activity, students r…

    Submitted 29 January, 2017; originally announced January 2017.

    Comments: in press, CHANCE

    MSC Class: 62.01

  18. Challenges and opportunities for statistics and statistical education: looking back, looking forward

    Authors: Nicholas Jon Horton

    Abstract: The 175th anniversary of the ASA provides an opportunity to look back into the past and peer into the future. What led our forebears to found the association? What commonalities do we still see? What insights might we glean from their experiences and observations? I will use the anniversary as a chance to reflect on where we are now and where we are headed in terms of statistical education amidst…

    Submitted 28 April, 2015; v1 submitted 7 March, 2015; originally announced March 2015.

    Comments: In press: The American Statistician

  19. arXiv:1502.00318  [pdf, other]

    stat.CO cs.CY stat.OT

    Setting the stage for data science: integration of data management skills in introductory and second courses in statistics

    Authors: Nicholas J. Horton, Benjamin S. Baumer, Hadley Wickham

    Abstract: Many have argued that statistics students need additional facility to express statistical computations. By introducing students to commonplace tools for data management, visualization, and reproducible analysis in data science and applying these to real-world scenarios, we prepare them to think statistically. In an era of increasingly big data, it is imperative that students develop data-related c…

    Submitted 1 February, 2015; originally announced February 2015.

    MSC Class: 62-01

  20. Data Science in Statistics Curricula: Preparing Students to "Think with Data"

    Authors: Johanna Hardin, Roger Hoerl, Nicholas J. Horton, Deborah Nolan

    Abstract: A growing number of students are completing undergraduate degrees in statistics and entering the workforce as data analysts. In these positions, they are expected to understand how to utilize databases and other data warehouses, scrape data from Internet sources, program solutions to complex problems in multiple languages, and think algorithmically as well as statistically. These data science topi…

    Submitted 4 August, 2015; v1 submitted 12 October, 2014; originally announced October 2014.

  21. R Markdown: Integrating A Reproducible Analysis Tool into Introductory Statistics

    Authors: Ben Baumer, Mine Cetinkaya-Rundel, Andrew Bray, Linda Loi, Nicholas J. Horton

    Abstract: Nolan and Temple Lang argue that "the ability to express statistical computations is an essential skill." A key related capacity is the ability to conduct and present data analysis in a way that another person can understand and replicate. The copy-and-paste workflow that is an artifact of antiquated user-interface design makes reproducibility of statistical analysis more difficult, especially as…

    Submitted 8 February, 2014; originally announced February 2014.

    Comments: 20 pages, plus a 10 page appendix

    MSC Class: 62-01

    Journal ref: Technology Innovations in Statistics Education, 8(1), 2014

  22. arXiv:1401.3269  [pdf]

    stat.CO cs.CY stat.OT

    Teaching precursors to data science in introductory and second courses in statistics

    Authors: Nicholas J Horton, Benjamin S Baumer, Hadley Wickham

    Abstract: Statistics students need to develop the capacity to make sense of the staggering amount of information collected in our increasingly data-centered world. Data science is an important part of modern statistics, but our introductory and second statistics courses often neglect this fact. This paper discusses ways to provide a practical foundation for students to learn to "compute with data" as define…

    Submitted 14 January, 2014; originally announced January 2014.

    MSC Class: 62-07

  23. Adjusting models of ordered multinomial outcomes for nonignorable nonresponse in the occupational employment statistics survey

    Authors: Nicholas J. Horton, Daniell Toth, Polly Phipps

    Abstract: An establishment's average wage, computed from administrative wage data, has been found to be related to occupational wages. These occupational wages are a primary outcome variable for the Bureau of Labor Statistics Occupational Employment Statistics survey. Motivated by the fact that nonresponse in this survey is associated with average wage even after accounting for other establishment character…

    Submitted 31 July, 2014; v1 submitted 3 January, 2014; originally announced January 2014.

    Comments: Published in at http://dx.doi.org/10.1214/14-AOAS714 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS714

    Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 2, 956-973

  24. I hear, I forget. I do, I understand: a modified Moore-method mathematical statistics course

    Authors: Nicholas Jon Horton

    Abstract: Moore introduced a method for graduate mathematics instruction that consisted primarily of individual student work on challenging proofs (Jones, 1977). Cohen (1982) described an adaptation with less explicit competition suitable for undergraduate students at a liberal arts college. This paper details an adaptation of this modified Moore-method to teach mathematical statistics, and describes ways t…

    Submitted 28 September, 2013; originally announced September 2013.

    MSC Class: 62-01