ENH: Enable append mode to JSON lines #35849
Comments
How much code is required to do the append yourself? Is it not just a case of passing an open file handle to an existing JSON file?
The parsing part is not much extra code. What I struggled with when creating an external module was that pandas can read and write multiple compression formats on the fly, so I would have to re-implement support case by case to cover the same formats. Improving pandas itself was the natural choice. But the point here is not that I have a specific scenario I want to push into pandas; rather, I believe this is a missing capability for a format pandas already supports. JSON Lines should have an append mode for the same reasons append mode makes sense for CSV files: the format was created with easy appends as a feature. For example, a common scenario is gathering data from a batch script and appending the changes to an existing file daily.
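To make the "append yourself" option concrete, here is a minimal sketch that uses only the Python standard library, not pandas. The helper names append_json_lines and read_json_lines are hypothetical, and only gzip is handled; it is exactly this per-format branching that would have to be repeated for every compression format pandas supports.

```python
import gzip
import json


def append_json_lines(path, records):
    """Append each record as one JSON document per line (hypothetical helper)."""
    # Assumption: a ".gz" suffix means gzip; any other path is plain text.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "at", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_json_lines(path):
    """Read every non-blank line back as a JSON document (hypothetical helper)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Calling append_json_lines once per batch run grows the file without re-reading or re-parsing what is already there, which is the property that makes an append mode natural for this format.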
@gfyoung do you know any maintainers or contributors who might be interested in commenting on or reviewing this change? As suggested, I created this issue to start a discussion (the PR is already done). Since there have not been many comments, I'm trying to get some reviews here before conflicts affect the PR and it becomes obsolete.
Hi. Can I get an update on this issue being fixed?
@charizard-knows pandas is all volunteer. The issue will be fixed when a community member opens a pull request; pandas core will provide code review.
take
@jreback I opened a pull request for the code change. It looks like it failed a typing check, but I can't see any information on why it failed. Do you have any insight into what the issue is?
Is your feature request related to a problem?
The JSON Lines format allows appending to an existing file in an easy way (each new line is a valid JSON string). Historically, DataFrame().to_json didn't allow mode="a" because it would introduce the complications of reading, parsing, and changing pure JSON strings. But for JSON Lines it can be done elegantly, as easily as with CSV files.
The pandas way of using JSON Lines is to set orient='records' together with lines=True, but it lacks a mode="a" for append mode. My feature proposal (PR already done, just needs review) is simple: include the capability of append mode (mode="a") on to_json IF orient='records' and lines=True.
Describe the solution you'd like
The PR: #35832
If I got it right, the solution is simple:
1. Add a mode: str = "w" argument to to_json; this will NOT break anything, as the default behavior continues to be write mode.
2. If mode="a", check that orient='records' and lines=True, raising an Exception otherwise.
3. If mode="a" and the file already exists and is not empty, add a new line to the JSON string (s = convert_to_line_delimits(s)) before sending it to handler.
4. Use get_handle(path_or_buf, mode, compression=compression) instead of the hardcoded get_handle(path_or_buf, "w", compression=compression).
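The steps listed above can be sketched outside pandas as follows. This is only an illustration under stated assumptions, not the code from the PR: write_json_lines is a hypothetical stand-in for to_json, it takes a list of dicts rather than a DataFrame, it uses the built-in open instead of get_handle, and it assumes the serialized payload has no trailing newline.

```python
import json
import os


def write_json_lines(records, path, mode="w", orient="records", lines=True):
    """Hypothetical stand-in for the proposed to_json behaviour (not pandas code)."""
    # Step 1: mode defaults to "w", so existing callers see no change.
    if mode not in ("w", "a"):
        raise ValueError(f"mode={mode!r} is not supported; use 'w' or 'a'")

    # Step 2: appending is only allowed for line-delimited records.
    if mode == "a" and not (orient == "records" and lines):
        raise ValueError("mode='a' requires orient='records' and lines=True")

    if lines:
        # One JSON document per line, with no trailing newline (assumption).
        payload = "\n".join(json.dumps(record) for record in records)
    else:
        payload = json.dumps(list(records))

    # Step 3: when appending to a non-empty file, start on a fresh line.
    if mode == "a" and os.path.exists(path) and os.path.getsize(path) > 0:
        payload = "\n" + payload

    # Step 4: pass the requested mode through instead of a hardcoded "w".
    with open(path, mode, encoding="utf-8") as handle:
        handle.write(payload)
```

A first call with the default mode creates or overwrites the file, and later calls with mode="a" add rows to it; asking for mode="a" with lines=False raises a ValueError before anything is written.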