rOpenSci | Blog


finch - parse Darwin Core files

finch has just been released to CRAN (binaries should be up soon). finch is a package to parse Darwin Core files. Darwin Core (DwC) is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information....

Highlights and Resources from Community Call v12: How do I create a code of conduct for my event/lab/codebase?

Our Community Call on December 15th covered a big topic in tech communities: “How do I create a code of conduct for my event/lab/codebase?” Here, we cover some of the key themes and considerations that arose from the discussion and point to curated resources and examples to follow when developing a code of conduct (CoC) for your community. Three guest speakers shared different perspectives. Dr Pauline Barmby talked about the process and lessons learned as Data Carpentry and Software Carpentry recently updated their CoC; Ms Safia Abdalla talked about “Codes of conduct for open source: the stuff no one tells you”; and Dr Titus Brown talked about his lab CoC....

Announcing our first fellowship awarded to Dr. Nick Golding

rOpenSci’s overarching mission is to promote a culture of transparent, open, and reproducible research across various scientific communities. All of our activities are geared towards lowering barriers to participation, and building a community of practitioners around the world. In addition to developing and maintaining a large suite of open source tools for data science, we actively support the research community with expert review on research software development, community calls, and hosting annual unconferences around the world....

Announcing pdftools 1.0

This week we released version 1.0 of the rOpenSci pdftools package to CRAN. pdftools provides utilities for extracting text, fonts, attachments and other data from PDF files. It also supports rendering of PDF files into bitmap images. This release has a few internal enhancements and fixes an annoying bug for landscape PDF pages. The version bump to 1.0 signifies that the package has undergone sufficient testing and the API is stable....

Tesseract Update: Options and Languages

A few weeks ago we announced the first release of the tesseract package: a high-quality OCR engine in R. We have now released an update with extra features. Installing Training Data: As explained in the first post, the tesseract system is powered by language-specific training data. By default only English training data is installed. Version 1.3 adds utilities to make it easier to install additional training data....
