Brian Nosek recently shared the "top" OSF-hosted preprints of 2017. While I don't believe article metrics correlate with quality or importance, it is interesting to see which articles have grabbed the attention of the masses. There is also a narrative out there that preprints are not read, so it is nice to see that many preprints are read more than most journal articles. The data below are accurate as of December 21st, 2017.

Title Views Downloads
Gender differences and bias in open source: Pull request acceptance of women versus men 98798 61528
Sci-Hub provides access to nearly all scholarly literature 43818 13746
Statistical heartburn: An attempt to digest four pizza publications from the Cornell Food and Brand Lab 24626 6365
The role of Twitter in the life cycle of a scientific publication 17769 6441
The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles 17522 6250
Software development: do good manners matter? 16458 1620
An empirical study of goto in C code 15628 8345
Zika virus impairs growth in human neurospheres and brain organoids 14357 3712
Nine simple ways to make it easier to (re)use your data 13797 5743
The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology 10613 2186
Forecasting at scale 9374 5664
CATALISE: a multinational and multidisciplinary Delphi consensus study of problems with language development. Phase 2. Terminology 9183 4735
Publiphilia Impactfactorius: a new psychiatric syndrome among biomedical scientists? 9166 764
The neck of Barosaurus: longer, wider and weirder than those of Diplodocus and other diplodocines 7538 1292
Effects of library size variance, sparsity, and compositionality on the analysis of microbiome data 6700 3304
Manipulating the alpha level cannot cure significance testing – comments on "Redefine statistical significance" 6173 1970
What is the Truck Factor of popular GitHub applications? A first assessment 5985 9212
Testing the hypothesis that routine sea ice coverage of 3-5 mkm2 results in a greater than 30% decline in population size of polar bears (Ursus maritimus) 5951 1701
scikit-image: Image processing in Python 5833 8789
The blue-green eggs of dinosaurs: How fossil metabolites provide insights into the evolution of bird reproduction 5630 1583
The prehistory of biology preprints: a forgotten experiment from the 1960s 5462 399
A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too) 5189 27630
Excuse me, do you have a moment to talk about version control? 5189 1335
Opinionated analysis development 5141 1099
Evidence based pathways to intervention for children with language disorders 5114 1630
Do you speak open science? Resources and tips to learn the language 5096 708
Data organization in spreadsheets 4970 1359
Combined proteome and transcriptome analyses reveal that Zika virus circulating in Brazil alters cell cycle and neurogenic programmes in human neurospheres 4803 1252
The earth is flat (p>0.05): Significance thresholds and the crisis of unreplicable research 4803 1067
Problems in using p-curve analysis and text-mining to detect rate of p-hacking and evidential value 4792 618
Making research data repositories visible: the registry 4755 1624
The four pillars of scholarly publishing: The future and a foundation 4671 1086
Is the low-cost EyeTribe eye tracker any good for research? 4616 10061
To crowdfund research, scientists must build an audience for their work 4587 3251
Achieving human and machine accessibility of cited data in scholarly publications 4542 1149
Draft sequencing and assembly of the genome of the world’s largest fish, the whale shark: Rhincodon typus Smith 1828 4284 1528
How to share data for collaboration 4262 847
Autonomous Sensory Meridian Response (ASMR): A flow-like mental state 4192 2398
Declutter your R workflow with tidy tools 4169 1390
Research groups: how big should they be? 4121 1216
Lack of quantitative training among early-career ecologists: a survey of the problem and potential solutions 4118 1690
Imagining the ‘open’ university: Sharing scholarship to improve research and education 4104 1326
CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes 4085 2015
Dark Research: information content in many modern research papers is not easily discoverable online 3827 1158
A nested phylogenetic reconstruction approach provides scalable resolution in the eukaryotic Tree Of Life 3783 806
VSEARCH: a versatile open source tool for metagenomics 3771 470
Sentieon DNA pipeline for variant detection - Software-only solution, over 20× faster than GATK 3.3 with identical results 3758 1447
Angiosperm phylogeny poster (APP) – Flowering plant systematics, 2017 3680 1673
Lung cancer incidence decreases with elevation: evidence for oxygen as an inhaled carcinogen 3623 389
Revised geochronology, correlation, and dinosaur stratigraphic ranges of the Santonian-Maastrichtian (Late Cretaceous) formations of the Western Interior of North America 3533 462
The GRIMMER test: A method for testing the validity of reported measures of variability 3459 455
Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products 3438 603
Accessing biological data in R with semantic web technologies 3423 1382
CATALISE: a multinational and multidisciplinary Delphi consensus study. Identifying language impairments in children 3410 981
OncoRank: A pan-cancer method of combining survival correlations and its application to mRNAs, miRNAs, and lncRNAs 3373 447
Ten simple rules for digital data storage 3294 1519
Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies 3213 1167
Validation and development of COI metabarcoding primers for freshwater macroinvertebrate bioassessment 3161 1370
On the intrinsic sterility of 3D printing 3145 1584
The first reported ceratopsid dinosaur from eastern North America (Owl Creek Formation, Upper Cretaceous, Mississippi, USA) 3122 390
Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system 3104 4418
Preliminary evidence of a new microbial species capable of sustainable intracellular survival and transfer in mammalian cell lines 3097 2215
The democratization of data science education 3075 1140
The impact of Docker containers on the performance of genomic pipelines 3065 828
OSoMe: The IUNI observatory on social media 3057 444
Swarm: robust and fast clustering method for amplicon-based studies 3045 874
Swabs to genomes: a comprehensive workflow 2989 1668
Mining usage patterns for the Android API 2966 657
Best practice in mixed effects modelling and multi-model inference in ecology 2946 1177
Crossing the streams: a framework for streaming analysis of short DNA sequencing reads 2885 563
Cannabis chemovar classification: terpenes hyper-classes and targeted genetic markers for accurate discrimination of flavours and effects 2861 300
Reproducible and reusable research: Are journal data sharing policies meeting the mark? 2840 251
In silico identification of off-target pesticidal dsRNA binding in honey bees (Apis mellifera) 2808 86
The Tao of Open Science for Ecology 2762 860
How R helps Airbnb make the most of its data 2735 13178
Characterization of RNA in exosomes secreted by human breast cancer cell lines using next-generation sequencing 2731 1022
A linear sigmoid colon passage method by left twisting 2727 315
The furculae of the dromaeosaurid dinosaur Dakotaraptor steini are trionychid turtle entoplastra 2720 1091
True's beaked whale (Mesoplodon mirus) in Macaronesia 2709 339
GenGIS 2: Geospatial analysis of traditional and genetic biodiversity, with new gradient algorithms and an extensible plugin framework 2693 664
GroopM: An automated tool for the recovery of population genomes from related metagenomes 2681 829
Sequencing at sea: Challenges and experiences in Ion Torrent PGM sequencing during the 2013 Southern Line Islands Research Expedition 2660 368
Ten simple rules for considering preprints 2642 438
Anvi’o: An advanced analysis and visualization platform for ‘omics data 2616 1626
Teaching stats for data science 2616 719
Tiling the genome into consistently named subsequences enables precision medicine and machine learning with millions of complex individual data-sets 2615 603
A mathematical theory of knowledge, science, bias and pseudoscience 2598 638
mockrobiota: a public resource for microbiome bioinformatics benchmarking 2582 333
A standardized, extensible framework for optimizing classification improves marker-gene taxonomic assignments 2570 1065
Wrangling categorical data in R 2534 829
Packaging data analytical work reproducibly using R (and friends) 2513 712
A linear sigmoid colon passage method in colonoscopy 2507 704
Consistent, comprehensive and computationally efficient OTU definitions 2501 1128
Counterinsurgency Doctrine Applied to Infectious Disease 2498 1391
Relative toxicity of herbicide use in the United States 1990 to 2015 2489 383
Sympathy for the devil: a conservation strategy for devil and manta rays 2483 808
A world of opportunities with nanopore sequencing 2477 1395
BALSA: Integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU 2466 810
Spider phylogenomics: untangling the Spider Tree of Life 2451 793
Can editors save peer review from peer reviewers? 2418 235
Why respiratory viruses or bacteria have the highest probability to be deposited in the respiratory tract in flu seasons 2390 228
Were the necks of Apatosaurus and Brontosaurus adapted for combat? 2383 258
LecoS - A QGIS plugin for automated landscape ecology analysis 2361 5597
An effective rectal administration method in the left semiprone position 2360 158
Zeta potential of bacterial cells: Effect of wash buffers 2336 1228
Reevaluation of the largest published morphological data matrix for phylogenetic analysis of Paleozoic limbed vertebrates 2310 793
Geometric morphometric analysis of snout shape in extant ruminants (Ungulata, Artiodactyla) 2265 606
Software Citation Principles 2259 388
Angiosperm Phylogeny Group (APG) in jeopardy – Where have the flowers gone? 2249 3541
Formalized synthesis opportunities for ecology: systematic reviews and meta-analyses 2241 505
Women are underrepresented on the editorial boards of journals in environmental biology and natural resource management 2241 240
The Chinese colossus: an evaluation of the phylogeny of Ruyangosaurus giganteus and its implications for titanosaur evolution 2236 514
The fragile legacy of Amphicoelias fragillimus (Dinosauria: Sauropoda; Morrison Formation - Latest Jurassic) 2234 2301
ANGSD-wrapper: utilities for analyzing next generation sequencing data 2203 841
Enzyme characterisation and kinetic modelling of the pentose phosphate pathway in yeast 2185 1625
A macroecological theory of microbial biodiversity 2175 890
What’s for dinner?: Undescribed species of porcini in a commercial packet 2165 710
Harmine stimulates neurogenesis of human neural cells in vitro 2143 782
rotl, an R package to interact with the Open Tree of Life data 2142 478
Signatures of rapid evolution in urban and rural transcriptomes of white-footed mice (Peromyscus leucopus) in the New York metropolitan area 2137 605
On the relationship between native grouper and invasive lionfish in the Caribbean 2122 2132
Zika virus: A newly emergent vector-borne public health threat in the Americas 2111 634
BioNames: linking taxonomy, texts, and trees 2041 450
A preliminary case study of the effect of shoe-wearing on the biomechanics of a horse’s foot 2034 423
The science behind Smarter Lunchrooms 2033 393
A survey of secure middleware for the Internet of Things 2028 3528
A method for simultaneous measurement of soil bacterial abundances and community composition via 16S rRNA gene sequencing 2026 1532
DICOM for quantitative imaging biomarker development: A standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research 2019 328
How to critically read ecological meta-analyses. 2010 1238
OncoLnc: Linking TCGA survival data to mRNAs, miRNAs, and lncRNAs 2008 323
Treating patients with compounds or biological agents that pass the safety-test 1993 75
Extending R with C++: A Brief Introduction to Rcpp 1984 674
The Shape of Color: Retinal Cones and Spectral Dispersion 1982 477
Evidence points out that "Nanotyrannus" is a juvenile Tyrannosaurus rex 1966 957
Setting the record straight on invasive lionfish control: Culling works 1964 1290
Good Practice for Conference Abstracts and Presentations: GP-CAP 1962 893
Hotspot analysis: a first prototype Python plugin enabling exploratory spatial data analysis into QGIS 1952 593
Male aggressiveness as intrasexual contest competition in 78 societies. 1950 175
SUPERSMART: ecology and evolution in the era of big data 1947 937
Multitrophic functional diversity predicts ecosystem functioning in experimental assemblages of estuarine consumers 1918 1544
Minimum soil nutrient guidelines for turfgrass developed from Mehlich 3 soil test results 1905 653
The enveomics collection: a toolbox for specialized analyses of microbial genomes and metagenomes 1899 571
Reef growth and limestone erosion 1876 196
New insights into the lifestyle of Allosaurus (Dinosauria: Theropoda) based on another specimen with multiple pathologies 1834 508
SymPy: Symbolic computing in Python 1831 1126
Coral reef baselines: how much macroalgae is natural? 1824 994
A genetic manipulation of motor neuron excitability does not alter locomotor output in Drosophila larvae 1817 544
CauseMap: Fast inference of causality from complex time series 1813 889
What do programmers know about the energy consumption of software? 1813 791
Common mistakes in data presentation and statistical analysis: how can the BioStat Decision Tool help? 1801 1876
Morphological evidence supports Dryolestoid affinities for the living Australian marsupial mole Notoryctes 1764 1199
Quantifying the effect of intervertebral cartilage on neutral posture in the necks of sauropod dinosaurs 1758 380
The pupillary light response reflects exogenous attention and inhibition of return 1756 2059
A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015 1740 2346
Interoperability and FAIRness through a novel combination of Web technologies 1735 321
Non invasive analysis of metabolic changes following nutrient input into diverse fish species, as investigated by metabolic and microbial profiling approaches 1730 192
Oops, my tests broke the build: An analysis of Travis CI builds with GitHub 1720 2191
Restore: An R of sustainability that can tame the “conundrum” 1710 377
Randomised Badger Culling Trial: Impact, based on more extensive data 1704 295
Whale shark (Rhincodon typus) behavior: A multi-year analysis of individuals at Georgia Aquarium 1703 170
Conducting perception research over the Internet: a tutorial review 1692 447
Almost all known sauropod necks are incomplete and distorted 1689 578
Accelerating the XGBoost algorithm using GPU computing 1687 749
Genome-wide analyses reveal clustering in Cannabis cultivars: the ancient domestication trilogy of a panacea 1685 518
Ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses 1674 460
Runtime metric meets developer - Building better cloud applications using feedback 1671 1739
MetaCRAST: Reference-guided extraction of CRISPR spacers from unassembled metagenomes 1655 281
The tolerability of transcranial electrical stimulation used across extended periods in a naturalistic context by healthy individuals 1651 1916
Lessons from between the white lines for isolated data scientists 1644 420
Formatting Open Science: agilely creating multiple document formats for academic manuscripts with Pandoc Scholar 1629 325
Diversity is the question, not the answer 1627 359
Paleontology of the Bears Ears National Monument: history of exploration and designation of the monument 1615 276
A call for virtual experiments: accelerating the scientific process 1614 387
Preventing Zika virus infection during pregnancy by timing conception seasonally 1613 477
Advances and limits of using population genetics to understand local adaptation 1599 2475
A framework for smartphone-enabled, patient-generated health data analysis 1596 169
Marine biodiversity and ecosystem functioning: what’s known and what’s next? 1571 1358
Magic Strategies: the basic biology of multilevel, multiscale, health promotion 1571 744
Journal title abbreviations should be eliminated in the digital age 1570 716
Predicting trophic discrimination factor using Bayesian inference and phylogenetic, ecological and physiological data. DEsIR: Discrimination Estimation in R. 1554 399
Distributions of p-values smaller than .05 in Psychology: What is going on? 1538 592
When is an ecological network complex? Connectance drives degree distribution and emerging network properties 1538 535
Overload injuries in barefoot/minimal footwear running: evidence from crowd sourcing 1536 1145
Cancer as a script and possible implications on workings of genome 1530 398
Self-generated morphology in lagoon reefs 1528 250
If funders and libraries subscribed to open access: The case of eLife, PLOS, and BioOne 1519 597
Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding 1519 387
Culture-independent detection and characterisation of Mycobacterium tuberculosis and M. africanum in sputum samples using shotgun metagenomics on a benchtop sequencer 1519 378
DNA methylation marks inter-nucleosome linker regions throughout the human genome 1517 786
The man-machine integration era 1502 298
Development of a semi-defined medium for high cell density cultivation of Escherichia coli in shake flasks: Part 1 1497 356
A scaled-down workflow for Illumina shotgun sequencing library preparation: lower input and improved performance at small fraction of the cost 1494 472
Ethylene induces soil microbes to delay fruit ripening 1491 850
By-catch in a recreational fishery: an unmonitored source of mortality. 1486 313
Evaluating a lightweight transcriptome assembly pipeline on two closely related ascidian species 1479 837
Redescription of Arundel formation Ornithomimosaur material and a reinterpretation of Nedcolbertia justinhofmanni as an “Ostrich Dinosaur”: Biogeographic implications 1475 384
Free flow of sweat due to loss of surface tension at sweat droplets causes water-induced skin wrinkling 1463 1235
Assessing strengths and weaknesses of DNA metabarcoding based macroinvertebrate identification for routine stream monitoring 1458 576
Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass - sequence relationships with an innovative metabarcoding protocol 1455 719
Defending scientific integrity in conservation policy processes: lessons from Canada, Australia, and the United States 1443 101
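
For anyone who wants to work with the list above, here is a minimal sketch in Python for splitting each row into title, views, and downloads. It assumes the last two whitespace-separated tokens on a line are always the counts, which keeps titles that end in a number (such as a year) intact. The names `parse_row`, `rows`, and `more_downloads_than_views` are made up for illustration.

```python
def parse_row(line):
    """Split one row of the listing into (title, views, downloads).

    Assumption: the last two whitespace-separated tokens are the counts,
    so a title ending in a number (e.g. "... 1990 to 2015") is preserved.
    """
    title, views, downloads = line.rsplit(None, 2)
    return title, int(views), int(downloads)


# Three rows copied from the listing above.
rows = [
    "Forecasting at scale 9374 5664",
    "Relative toxicity of herbicide use in the United States 1990 to 2015 2489 383",
    "How R helps Airbnb make the most of its data 2735 13178",
]

parsed = [parse_row(r) for r in rows]

# Titles whose download count exceeds their view count.
more_downloads_than_views = [title for title, views, downloads in parsed
                             if downloads > views]
```

Splitting from the right rather than the left is the design choice that matters here, since titles contain arbitrary spaces and sometimes digits.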