to_sql Out of bounds nanosecond timestamp #26761
Here is my attempt at reproducing the issue with two different versions of pandas:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql+pymysql://username:password@localhost/database?host=localhost&port=3306', echo=False)
df = pd.DataFrame({'DOB': {0: '26/1/2016 ', 2: '31/12/9999 '}})
df.DOB = pd.to_datetime(df.DOB)  # This line throws the error
df.to_sql('users', con=engine, index=False, if_exists='replace')
```

Here is what I got with pandas v0.24.2:
@jreback @jorisvandenbossche Is this the expected behaviour, or is this a bug?
This is the expected behavior. `pd.Timestamp` supports datetimes from 1677-09-21 to 2262-04-11. Is 12/31/9999 a placeholder for missing data? If so, consider replacing it with `NaT`.
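The supported range can be verified directly. A standalone sketch (mine, not from the thread) showing the bounds and the exception raised beyond them:

```python
import pandas as pd

# datetime64[ns] can only span roughly 584 years around the Unix epoch.
print(pd.Timestamp.min)  # 1677-09-21 00:12:43.145224193
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807

try:
    pd.to_datetime("31/12/9999", dayfirst=True)
except pd.errors.OutOfBoundsDatetime as exc:
    print("raised:", exc)
```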
Thanks for looking at this. It is impossible for me to replace out-of-range dates with `NaT`, because I can't tell which columns hold dates across more than 100 tables, and the year can be 9999 or any other number greater than 2261.
@gonghwa we need a reproducible example from you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports. If you're able to use sqlite, that's best, since it's included with Python. If you can't reproduce with sqlite, then you'll need to provide the database info (the database and SQLAlchemy driver you're using).
Hi Tom, here is code that reproduces the error, but the issue only exists on mssql, not sqlite3:

```python
import pandas as pd
import datetime
from sqlalchemy import create_engine

connapt = create_engine("mssql+pyodbc://SQLEXPRESSAPT")
df = pd.DataFrame({'date': datetime.datetime(9999, 1, 1)}, index=[0])
df.to_sql('pd024', connapt, index=False, if_exists="replace")
```

With 0.23.4 it finished without error; with 0.24.2 it reports:

```
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-01-01 00:00:00
```

Thank you.
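One hedged workaround (my own sketch, not from the thread, shown against sqlite3 since mssql isn't reproducible here): convert the out-of-range dates to ISO strings so pandas never builds a `datetime64[ns]` column, and let the database parse them on its side.

```python
import datetime
import sqlite3

import pandas as pd

df = pd.DataFrame({'date': datetime.datetime(9999, 1, 1)}, index=[0])
# Store the column as plain strings instead of datetime objects, so
# to_sql never attempts a nanosecond-timestamp conversion.
df['date'] = df['date'].apply(lambda d: d.isoformat(sep=' '))

with sqlite3.connect(':memory:') as conn:
    df.to_sql('pd024', conn, index=False, if_exists='replace')
    print(pd.read_sql('SELECT * FROM pd024', conn))
```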
Do you see anywhere in `to_sql` where we convert object-dtype columns to datetime?
I got time to look into the details and found that the error comes from a change in pandas/io/sql.py, around line 877. In pandas 0.23 the corresponding code was different; in 0.24, calling `col.dt.tz` raises the error. Shall we add one more `except` clause to handle `OutOfBoundsDatetime`?
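The extra `except` being proposed could look something like the following. `has_timezone` here is a hypothetical, simplified stand-in for the branch in `pandas/io/sql.py`, not the actual source:

```python
import pandas as pd
from pandas.errors import OutOfBoundsDatetime

def has_timezone(col):
    """Return the tz of a column if it parses as datetimes, else None."""
    try:
        return pd.to_datetime(col).dt.tz
    except (TypeError, ValueError, OutOfBoundsDatetime):
        # Not datetime-like, or outside the datetime64[ns] range:
        # treat it as having no timezone instead of failing to_sql.
        return None

print(has_timezone(pd.Series(['9999-01-01'])))  # None, no longer raises
print(has_timezone(pd.Series(['2016-01-26'])))  # None (tz-naive dates)
```

Note that `OutOfBoundsDatetime` actually subclasses `ValueError`, so listing it explicitly is for readability.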
Do you have a small example with data where the …
The error message says: …, so I guess the attribute comes from calling the function `objects_to_datetime64ns`, which has the limitation that it supports datetimes from 1677-09-21 to 2262-04-11.
Thanks for finding the root issue. That definitely looks like a bug, since that specific branch is really just checking for potential timezones in the date schema columns. PRs always welcome.
My pleasure. I found that a date outside the period from 1677-09-21 to 2262-04-11 that carries a timezone will raise another exception: `ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True`.
So, in conclusion, when the date is out of bounds: …
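A quick check of that second failure mode (my own sketch; the exact exception type and message vary across pandas versions, but a `ValueError` subclass is raised):

```python
import datetime
import pandas as pd

# A tz-aware datetime that is also outside the datetime64[ns] range.
aware = datetime.datetime(9999, 1, 1, tzinfo=datetime.timezone.utc)
try:
    pd.to_datetime(pd.Series([aware]))
except ValueError as exc:  # OutOfBoundsDatetime subclasses ValueError
    print(type(exc).__name__, exc)
```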
Hi, I get an error while fetching data from PostgreSQL: `Out of bounds nanosecond timestamp: 1-03-22 00:00:00`.
I am not sure whether it is fixed in a newer release, because I haven't used the same data in pandas for a long time. Try downgrading pandas to 0.23.4. If the error persists, you will have to find the root cause yourself.
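When you can't tell in advance which values are out of range, one way to locate them (a generic sketch of mine, not specific to PostgreSQL) is to fetch the column as text, parse it with `errors='coerce'`, and inspect the rows that come back as `NaT`:

```python
import pandas as pd

# Sample values standing in for a column fetched as text from the database.
raw = pd.Series(['2016-01-26', '0001-03-22', '9999-12-31'])
parsed = pd.to_datetime(raw, errors='coerce')  # out-of-range values become NaT
print(raw[parsed.isna()])  # the offending values: 0001-03-22, 9999-12-31
```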
Code Sample, a copy-pastable example if possible
Problem description
I recently upgraded pandas to 0.24.2 and encountered the error below, which did not occur before.
I am using Python to transfer data to SQL Server. Some dates are set to 9999-12-31, which causes: `OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-12-31 00:00:00`
Expected Output
We should be able to transfer those dates. Thanks.
Output of pd.show_versions()

INSTALLED VERSIONS
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.2
pytest: 3.8.0
pip: 19.0.2
setuptools: 40.2.0
Cython: 0.28.5
numpy: 1.16.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: 1.7.9
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.6
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.0
lxml.etree: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None