Hands-on with Research Data Management in Chemistry
ChE-601
Linklist
This page is part of the content downloaded from Linklist on Wednesday, 25 December 2024, 15:43. Note that some content and any files larger than 50 MB are not downloaded.
Learning more
- The Turing Way: "open source community-driven guide to reproducible, ethical, inclusive and collaborative data science"
- Tips for scientific code from the Chodera lab
- Here we have started to curate efforts around data in chemistry
- https://opensciencemooc.eu
Cookiecutters
Setting up a repository with one command:
- One from the Molecular Sciences Software Institute (MolSSI) for general scientific Python packages
- Still in development: Our cookiecutter to develop REST-APIs in Python
Note taking
- Markdown on GitHub or HackMD
- Notion/Evernote (Evernote is not free but has great OCR capabilities, i.e., you can also search in scans)
- Computational biology lab notebook rules
Testing code
- Pytest
- Test code with random inputs using hypothesis
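As a minimal sketch of what such tests can look like, the example below uses plain `assert` statements in functions named `test_*`, which is the layout pytest collects. The function `normalize_fractions` is a hypothetical helper invented for this illustration. The second test only imitates the idea behind hypothesis with the standard library's `random` module so that it runs without extra dependencies; hypothesis itself would generate the inputs and shrink failing cases automatically.

```python
import math
import random


def normalize_fractions(amounts):
    """Convert positive amounts (e.g. moles) into fractions that sum to 1.

    Hypothetical helper used only for this illustration.
    """
    total = sum(amounts)
    if total <= 0:
        raise ValueError("amounts must have a positive sum")
    return [a / total for a in amounts]


def test_known_example():
    # Example-based test: one hand-picked input with a known answer.
    assert normalize_fractions([1.0, 1.0, 2.0]) == [0.25, 0.25, 0.5]


def test_random_inputs():
    # Property-style test: check invariants on many random inputs.
    rng = random.Random(0)  # fixed seed so a failure can be reproduced
    for _ in range(200):
        amounts = [rng.uniform(0.1, 10.0) for _ in range(rng.randint(1, 10))]
        fractions = normalize_fractions(amounts)
        assert len(fractions) == len(amounts)
        assert all(0.0 < f <= 1.0 for f in fractions)
        assert math.isclose(sum(fractions), 1.0, rel_tol=1e-9)
```

Saved as a file whose name starts with `test_`, this would be picked up by running `pytest` in the same directory.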
Jupyter notebooks
- I do not like notebooks
- Response from the fast.ai developers: I like notebooks
- Run them in the cloud using Google Colab or MyBinder; one example is an ML tutorial that runs on Colab and/or Binder.
Reproducible ML
- comet.ml, wandb, Renku
- A recent symposium
- There are more and more reproducibility challenges; they are also a great way to get started with ML
Reproducible Code Environment
- Conda
- Docker
- Soon to come: GitHub Codespaces
- Overview of tools for Python
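Alongside these tools, it can help to store a record of the environment that actually produced a result. The following is a minimal sketch using only the Python standard library; the function name `environment_snapshot` and the layout of the returned dictionary are assumptions made for this illustration, and the snapshot complements rather than replaces a Conda environment file or a Docker image.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect interpreter, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved next to the results.
    print(json.dumps(environment_snapshot(), indent=2))
```

Writing this JSON next to each set of results makes it possible to check later which package versions were installed when the results were generated.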
FAIR data
- The FAIR paper: Wilkinson, M. D.; Dumontier, M.; Aalbersberg, Ij. J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L. B.; Bourne, P. E.; Bouwman, J.; Brookes, A. J.; Clark, T.; Crosas, M.; Dillo, I.; Dumon, O.; Edmunds, S.; Evelo, C. T.; Finkers, R.; Gonzalez-Beltran, A.; Gray, A. J. G.; Groth, P.; Goble, C.; Grethe, J. S.; Heringa, J.; ’t Hoen, P. A. C.; Hooft, R.; Kuhn, T.; Kok, R.; Kok, J.; Lusher, S. J.; Martone, M. E.; Mons, A.; Packer, A. L.; Persson, B.; Rocca-Serra, P.; Roos, M.; van Schaik, R.; Sansone, S.-A.; Schultes, E.; Sengstag, T.; Slater, T.; Strawn, G.; Swertz, M. A.; Thompson, M.; van der Lei, J.; van Mulligen, E.; Velterop, J.; Waagmeester, A.; Wittenburg, P.; Wolstencroft, K.; Zhao, J.; Mons, B. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Scientific Data 2016, 3, 160018. https://doi.org/10.1038/sdata.2016.18.
- https://ropensci.github.io/reproducibility-guide/
- Ten Simple Rules for the Care and Feeding of Scientific Data https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003542
- Assess whether you share your data in a FAIR way: https://www.ands-nectar-rds.org.au/fair-tool
- About persistent identifiers: https://www.ands.org.au/guides/persistent-identifiers-awareness
- Cost of not being FAIR https://moodle.epfl.ch/draftfile.php/1792632/user/draft/576426481/KI0219023ENN.en.pdf
- Piwowar, H. A.; Vision, T. J. Data Reuse and the Open Data Citation Advantage. PeerJ 2013, 1, e175. https://doi.org/10.7717/peerj.175.
- McKiernan, E. C.; Bourne, P. E.; Brown, C. T.; Buck, S.; Kenall, A.; Lin, J.; McDougall, D.; Nosek, B. A.; Ram, K.; Soderberg, C. K.; Spies, J. R.; Thaney, K.; Updegrove, A.; Woo, K. H.; Yarkoni, T. How Open Science Helps Researchers Succeed. eLife 2016, 5, e16800. https://doi.org/10.7554/eLife.16800.
- Woelfle, M.; Olliaro, P.; Todd, M. H. Open Science Is a Research Accelerator. Nature Chemistry 2011, 3 (10), 745–748. https://doi.org/10.1038/nchem.1149.
- Davies, I. W. The Digitization of Organic Synthesis. Nature 2019, 570 (7760), 175–181. https://doi.org/10.1038/s41586-019-1288-y.
- A special issue about the FAIR principles: https://www.mdpi.com/journal/publications/special_issues/fair
- Tennant, J. P.; Waldner, F.; Jacques, D. C.; Masuzzo, P.; Collister, L. B.; Hartgerink, C. H. J. The Academic, Economic and Societal Impacts of Open Access: An Evidence-Based Review. F1000Res 2016, 5, 632. https://doi.org/10.12688/f1000research.8460.3.
- Data Licensing: http://www.dcc.ac.uk/resources/how-guides/license-research-data
- Code Licensing: http://kbroman.org/steps2rr/pages/licenses.html, https://opensource.org/licenses, and choosealicense
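Several of the resources above stress that data should travel with metadata, a license, and a persistent identifier. As a rough sketch of that idea, the example below writes a small JSON "sidecar" file next to a data file. The function `describe_dataset`, the field names, and the example file are all hypothetical and do not follow any formal metadata schema; a repository that mints DOIs would define its own required fields.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def describe_dataset(path, title, creators, license_id, identifier=None):
    """Build a small metadata record for one data file.

    The field names are illustrative and do not follow a formal schema.
    """
    data = Path(path).read_bytes()
    return {
        "title": title,
        "creators": creators,
        "license": license_id,
        "identifier": identifier,  # e.g. a DOI once the data are deposited
        "file": Path(path).name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_file = Path(tmp) / "isotherm.csv"  # hypothetical example file
        data_file.write_text("pressure_bar,uptake_mol_per_kg\n1.0,2.5\n")
        record = describe_dataset(
            data_file,
            title="Example adsorption isotherm",
            creators=["A. Researcher"],
            license_id="CC-BY-4.0",
        )
        # Store the record next to the data as isotherm.metadata.json.
        sidecar = data_file.with_suffix(".metadata.json")
        sidecar.write_text(json.dumps(record, indent=2))
        print(sidecar.read_text())
```

The checksum lets anyone who downloads the file verify that it is the same one the record describes.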
Publishing
Version Control/Git