In the middle of the desert you can say anything you want
mwouts/itables: Pandas DataFrames as Interactive DataTables:
from itables import init_notebook_mode
init_notebook_mode(all_interactive=True)
The table kept showing “loading”. I set the notebook to ‘trusted’ as the help suggests, but that didn’t help.
But this did:
init_notebook_mode(all_interactive=True, connected=True)
(connected=True makes it load libraries dynamically from the internet, and is not the default)
Allows more interesting interactive tables, incl. things like sorting by column etc.
Related: 230529-1413 Plants datasets taxonomy
Citizen science (similar to [..] participatory/volunteer monitoring) is scientific research conducted with participation from the general public
most citizen science research publications being in the fields of biology and conservation
can mean multiple things, usually citizens acting as volunteers to help monitor/classify/.. stuff (but also citizens initiating stuff; also: educating the public about scientific methods, e.g. schools)
allowed users to upload photos of a plant species and its components, enter its characteristics (such as color and size), compare it against a catalog photo and classify it. The classification results are juried by crowdsourced ratings.4
“Here we present two Pl@ntNet citizen science initiatives used by conservation practitioners in Europe (France) and Africa (Kenya).”
@fuccilloAssessingAccuracyCitizen2015 (2015)
Volunteers demonstrated greatest overall accuracy identifying unfolded leaves, ripe fruits, and open flowers.
@crallAssessingCitizenScience2011 Assessing citizen science data quality (2011)
@chenPerformanceEvaluationDeep2021 (2021)
- Georeferenced plant observations from herbarium, plot, and trait records;
- Plot inventories and surveys;
- Species geographic distribution maps;
- Plant traits;
- A species-level phylogeny for all plants in the New World;
- Cross-continent, continent, and country-level species lists.
@ortizReviewInteractionsBiodiversity2021 A review of the interactions between biodiversity, agriculture, climate change, and international trade (2021)
(e.g. strong colour variation and the transformation of 3D objects after pressing like fruits and flowers) @waldchenMachineLearningImage2018 (2018)
@goeau2021overview (2021), @goeauAIbasedIdentificationPlant2021 (2021)
“Lab-based setting is often used by biologist that brings the specimen (e.g. insects or plants) to the lab for inspecting them, to identify them and mostly to archive them. In this setting, the image acquisition can be controlled and standardised. In contrast to field-based investigations, where images of the specimen are taken in-situ without a controllable capturing procedure and system. For field-based investigations, typically a mobile device or camera is used for image acquisition and the specimen is alive when taking the picture (Martineau et al., 2017).” @waldchenMachineLearningImage2018 (2018)
@pearseDeepLearningPhenology2021 (2021), but there DL failed less without flowers than non-DL), but sometimes don’t @walkerHarnessingLargeScaleHerbarium2022 (2022)
@goodwinWidespreadMistakenIdentity2015 (2015)
EDIT separate post about this: 230529-1413 Plants datasets taxonomy
We can classify existing datasets in two types:
@giselssonPublicImageDatabase2017 (2017)), common weeds in Denmark dataset @leminenmadsenOpenPlantPhenotype2020 (2020) etc.
FloraCapture requests contributors to photograph plants from at least five precisely defined perspectives
There are some special datasets, satellite and whatever, but especially:
@mamatAdvancedTechnologyAgriculture2022 Advanced Technology in Agriculture Industry by Implementing Image Annotation Technique and Deep Learning Approach (2022) has an excellent overview of these)
Additional info present in datasets or useful:
http://ceur-ws.org/Vol-2936/paper-122.pdf / @goeau2021overview (2021) ↩︎
https://hal-lirmm.ccsd.cnrs.fr/lirmm-03793591/file/paper-153.pdf / @goeau2022overview (2022) ↩︎
IBM and SAP open up big data platforms for citizen science | Guardian sustainable business | The Guardian ↩︎
Deep Learning with Taxonomic Loss for Plant Identification - PMC ↩︎
Goal: Interact with Zotero from within Obsidian
Solution: “Citations”1 plugin for Obsidian, “Better Bibtex”2 plugin for Zotero!
Neat bits:
There’s a configurable “Citations: Insert Markdown Citation” thing!
<_`@{{citekey}}` {{titleShort}} ({{year}}) [z]({{zoteroSelectURI}})/[d](https://doi.org/{{DOI}})_>
- {{citekey}}
- {{abstract}}
- {{authorString}}
- {{containerTitle}}
- {{DOI}}
- {{eprint}}
- {{eprinttype}}
- {{eventPlace}}
- {{page}}
- {{publisher}}
- {{publisherPlace}}
- {{title}}
- {{titleShort}}
- {{URL}}
- {{year}}
- {{zoteroSelectURI}}
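To see what the template above expands to, here is a minimal sketch that fills in the `{{...}}` variables with plain string substitution. It is not the plugin's own template engine, and every value in `entry` (citekey, title, DOI, select URI) is a made-up placeholder, not a real reference.

```python
import re

# The "Insert Markdown Citation" template from above.
TEMPLATE = (
    "<_`@{{citekey}}` {{titleShort}} ({{year}}) "
    "[z]({{zoteroSelectURI}})/[d](https://doi.org/{{DOI}})_>"
)

# Hypothetical entry: all values are placeholders for illustration only.
entry = {
    "citekey": "doe2020example",
    "titleShort": "An example title",
    "year": "2020",
    "zoteroSelectURI": "zotero://select/items/@doe2020example",
    "DOI": "10.0000/example",
}


def render(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} with values[name]; unknown names become empty."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values.get(m.group(1), ""), template)


rendered = render(TEMPLATE, entry)
print(rendered)
```

The `[z]` link opens the item in Zotero, the `[d]` link goes to the DOI resolver.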
hans/obsidian-citation-plugin: Obsidian plugin which integrates your academic reference manager with the Obsidian editor. Search your references from within Obsidian and automatically create and reference literature notes for papers and books. ↩︎
retorquere/zotero-better-bibtex: Make Zotero effective for us LaTeX holdouts ↩︎
<C-N> for me.
zotero:// links don’t work for me, and the default .desktop file they provide seems broken - TODO later
Gitstats is the best I know: tomgi/git_stats: GitStats is a git repository statistics generator.
gitstats /path/to/repo /path/to/output/dir
Generates comprehensive static html reports with graphs. Authors, files, times of the day/week/month, ….
4. More Control Flow Tools — Python 3.10.11 documentation:
def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something's wrong with the internet"
Also
case 401 | 403 | 404:
    return "Not allowed"
and
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
Lastly, you can capture subpatterns:
case (Point(x1, y1), Point(x2, y2) as p2): ...
Generally - #todo - I should systematically read up on new things in the not-latest-anymore Python versions, e.g.:
TIL PyCharm can automatically reformat files, incl. things like JSON. The action is “Reformat file”, on my install <C-S-a-L>
If not all files are seen in PyCharm project view:
A typo in a keybinding randomly led me to the graph view in Obsidian, never thought about it - but now apparently I have a lot of notes and it’s quite pretty!
I wanted to remove the #zc tag from graph view to make it clearer (since ALL notes have it basically.)
(177) How to hide tags, but keep notes with them in graph : ObsidianMD mentioned a way to do just that, though I’m not sure I understand it:
-(-path:folder (#tag1 OR #tag2 OR #tag3))
For me that’s:
-(-path:garden/it (#zc OR #zc/it))
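A way to read that filter, as a sketch: assuming that in Obsidian search a space means AND and a leading `-` means NOT, the expression is NOT(outside the folder AND has the tag), which by De Morgan is the same as "inside the folder OR without the tag". The truth table below checks this. The function names are made up, and the idea that tag nodes themselves count as "outside the folder" is my assumption about why they disappear.

```python
from itertools import product


def original_filter(in_folder: bool, has_tag: bool) -> bool:
    # -(-path:folder (#tag1 OR ...)): NOT (outside folder AND has tag)
    return not ((not in_folder) and has_tag)


def simplified(in_folder: bool, has_tag: bool) -> bool:
    # De Morgan: inside the folder OR no tag
    return in_folder or (not has_tag)


for in_folder, has_tag in product([True, False], repeat=2):
    # Both forms agree on every combination
    assert original_filter(in_folder, has_tag) == simplified(in_folder, has_tag)
    print(f"in_folder={in_folder} has_tag={has_tag} shown={original_filter(in_folder, has_tag)}")
```

So only things that match the tag and are not in the folder get dropped, and notes inside the folder stay regardless of their tags.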
# Display all columns and rows:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
# Don't truncate values
pd.set_option('display.max_colwidth', None)
This of course works:
with pd.option_context('display.max_colwidth', None):
    display(df)
Make cells 100% wide in Jupyter:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
And https://stackoverflow.com/a/51593236 has this function remarkably similar to the old one I’ve had, except that I changed print->display:
def print_full(x):
    pd.set_option('display.max_rows', None)
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    #print(x)
    display(x)
    pd.reset_option('display.max_rows')
    pd.reset_option('display.max_columns')
    pd.reset_option('display.width')
    pd.reset_option('display.float_format')
    pd.reset_option('display.max_colwidth')
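One thing to keep in mind with the function above: `pd.reset_option` goes back to the pandas defaults, not to whatever was set before the call. A possible variant, sketched with `pd.option_context`, restores the previous values instead. The name `full_repr` is made up, and it returns the text via `repr` so it also works outside a notebook (inside Jupyter one could call `display(x)` within the `with` block instead).

```python
import pandas as pd


def full_repr(x) -> str:
    """Return the untruncated repr of a DataFrame/Series.

    option_context restores the previous option values on exit,
    even if an exception is raised inside the block.
    """
    with pd.option_context(
        "display.max_rows", None,
        "display.max_columns", None,
        "display.width", 2000,
        "display.max_colwidth", None,
    ):
        return repr(x)


df = pd.DataFrame({"a": range(100)})
pd.set_option("display.max_rows", 5)      # some custom setting made earlier
text = full_repr(df)
print(len(text.splitlines()))             # all rows are in the text
print(pd.get_option("display.max_rows"))  # still the custom value
```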
Pandas convert column to categorical:
ds['column_name'].astype('category')
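A minimal sketch of the conversion on a toy DataFrame (the names `ds` and `colour` are made up for illustration). Note that `astype` is called on the column, not on the `pd` module, and that the result has to be assigned back.

```python
import pandas as pd

ds = pd.DataFrame({"colour": ["red", "green", "red", "blue"]})

# astype returns a new Series, so assign it back to the column
ds["colour"] = ds["colour"].astype("category")

print(ds["colour"].dtype)
print(list(ds["colour"].cat.categories))
```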
Pandas select numeric columns1:
ds.select_dtypes(include=[np.number])
Pandas divide columns by other column2:
(ds.T / ds.col2.T).T
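The double transpose works, but `DataFrame.div` with `axis=0` should give the same result more directly. A small sketch on toy data (the column names and values are invented) combining it with the numeric-column selection:

```python
import numpy as np
import pandas as pd

ds = pd.DataFrame({"col1": [2.0, 9.0], "col2": [2.0, 3.0], "name": ["a", "b"]})

# Keep only the numeric columns, dropping "name"
numeric = ds.select_dtypes(include=[np.number])

# The transpose trick from above ...
via_transpose = (numeric.T / numeric.col2.T).T
# ... and the same division done row-wise with div(axis=0)
via_div = numeric.div(numeric["col2"], axis=0)

print(via_div)
```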
python - Divide multiple columns by another column in pandas - Stack Overflow
3D plotting in matplotlib: Three-Dimensional Plotting in Matplotlib | Python Data Science Handbook & the official docu: 3D plotting — Matplotlib 3.7.1 documentation
Really nice relevant tutorial: How to handle time series data with ease — pandas 2.1.0.dev0+658.gc9de03effb documentation
sns.boxplot(data=s_dsm_conv, y='Dauer', x='Parameter')
> TypeError: Neither the `x` nor `y` variable appears to be numeric.
pd.Timedelta is indeed not numeric, but can be made one through
s_dsm_conv['Dauer'] = s_dsm_conv['Dauer'].astype('timedelta64[h]')
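On newer pandas versions the `astype('timedelta64[h]')` conversion may no longer be accepted (worth checking against the installed version). A sketch of an alternative that should not depend on that is going through `.dt.total_seconds()`. The toy data and the `Dauer_h` column name are made up here.

```python
import pandas as pd

s_dsm_conv = pd.DataFrame({
    "Parameter": ["a", "a", "b"],
    "Dauer": pd.to_timedelta(["1h", "90min", "2 days"]),
})

# Duration as a float number of hours, which plotting libraries treat as numeric
s_dsm_conv["Dauer_h"] = s_dsm_conv["Dauer"].dt.total_seconds() / 3600

print(s_dsm_conv["Dauer_h"].tolist())
```

The new column can then be passed as `y='Dauer_h'` to the boxplot call.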
# Gaps longer than one day
real_gaps=gaps[gaps>pd.Timedelta(1,"d")]
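For context, a sketch of where such a `gaps` Series could come from: the difference between consecutive timestamps. The timestamps below are invented. The first difference is NaT, and comparisons with NaT are False, so it drops out of the filter on its own.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2023-05-01 10:00",
    "2023-05-01 18:00",
    "2023-05-03 09:00",
    "2023-05-03 12:00",
]))

# Time elapsed since the previous timestamp (NaT for the first row)
gaps = timestamps.diff()

# Gaps longer than one day
real_gaps = gaps[gaps > pd.Timedelta(1, "d")]
print(real_gaps)
```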