Pandas joining and merging tables
I was trying to do a join based on two columns, one of which is a pd.Timestamp.
What I learned: if you're trying to join/merge two DataFrames not by their indexes, pandas.DataFrame.merge is better (yay precise language) than pandas.DataFrame.join.
Or rather, for some reason I had issues with df.join(..., on=[col1, col2]), even with df.set_index([col1, col2]).join(df2.set_index...); then it ran out of memory and I gave up.
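For reference, here is a minimal sketch of what the index-based form looks like on a tiny example. The frames and the column names (`ts`, `sensor`, `value`, `label`) are made up for illustration and are not the data from this post; it only shows the shape of the call, not why the original attempt ran out of memory.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
left = pd.DataFrame({
    "ts": pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-02"]),
    "sensor": ["a", "b", "a"],
    "value": [1.0, 2.0, 3.0],
})
right = pd.DataFrame({
    "ts": pd.to_datetime(["2021-01-01", "2021-01-02"]),
    "sensor": ["a", "a"],
    "label": ["x", "y"],
})

keys = ["ts", "sensor"]

# join() aligns on the index, so the key columns are moved into a
# MultiIndex on both sides first, then restored with reset_index().
joined = (
    left.set_index(keys)
        .join(right.set_index(keys), how="left")
        .reset_index()
)
print(joined)
```

Every row of `left` is kept; the row with no match in `right` gets a missing `label`.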
Then a SO answer [1] said:
use merge if you are not joining on the index
I tried it and df.merge(..., on=col2) magically worked!
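A small sketch of the column-based merge, assuming two made-up frames keyed on a Timestamp column and a string column (`ts` and `sensor` are hypothetical names, not the ones from my data). Note that pandas spells the key argument `on`. The `validate` argument is optional; I include it here only because it makes pandas raise an error if the right-hand keys turn out not to be unique.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
left = pd.DataFrame({
    "ts": pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-02"]),
    "sensor": ["a", "b", "a"],
    "value": [1.0, 2.0, 3.0],
})
right = pd.DataFrame({
    "ts": pd.to_datetime(["2021-01-01", "2021-01-02"]),
    "sensor": ["a", "a"],
    "label": ["x", "y"],
})

# merge() matches on ordinary columns, so no set_index() is needed.
merged = left.merge(
    right,
    on=["ts", "sensor"],     # both key columns, one of them a Timestamp
    how="left",              # keep every row of `left`
    validate="many_to_one",  # raise if the keys in `right` are duplicated
)
print(merged)
```

The key columns stay regular columns in the result, so there is nothing to reset afterwards.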
In the middle of the desert I can say whatever I want.