Automatically matching new Wikipedia articles with Wikidata items using Python - Task 2
Closed, Resolved, Public

Description

This is the second task for T290718, Automatically matching new Wikipedia articles with Wikidata items using Python, aimed at getting you familiar with Pywikibot.

  1. You should register a Wikimedia account if you don't already have one. You can do so at https://www.wikidata.org/w/index.php?title=Special:CreateAccount
  2. You should download and install Python 3 and pywikibot - see instructions at https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation . You will need to set up a user configuration file, and make sure you can log in. Set 'wikipedia' as the base family, and then manually add usernames['wikidata']['wikidata'] = u'Your username' into the generated user-config.py file. If you want, you can set a bot password (see https://www.mediawiki.org/wiki/Manual:Pywikibot/BotPasswords ) and use that - or use your normal Wikimedia login information.
  3. Set up a script that will connect to Wikidata, load the 'Outreachy_1' page that you created in the previous task, and print it out.
  4. Try adding 'Hello' to the end of the page you just loaded, and save it back.
  5. Load a Wikidata item (use 'Q4115189' to start with - it is the sandbox), and print out information from it. You could also try loading some of the other items you looked at in the first task.
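The script steps in the list above (load a page, print it, append 'Hello', save, then load an item) could be sketched roughly as follows. This is only a sketch under stated assumptions, not the expected solution: the helper names append_line and summarise_item, the placeholder username, the User:<username>/Outreachy_1 page title and the edit summary are all made up for illustration. pywikibot is imported inside main() so the two small helpers can be tried without it; calling main() makes a real edit, so it is left commented out.

```python
def append_line(text, addition="Hello"):
    """Return text with addition placed on a new final line."""
    if not text:
        return addition
    return text.rstrip("\n") + "\n" + addition


def summarise_item(item_dict, lang="en"):
    """Pick a few fields out of the dict returned by ItemPage.get()."""
    label = item_dict["labels"].get(lang, "(no label)")
    description = item_dict["descriptions"].get(lang, "(no description)")
    claim_count = sum(len(claims) for claims in item_dict["claims"].values())
    return f"{label}: {description} ({claim_count} claims)"


def main(username="YourUsername"):  # placeholder, replace with your own
    # Imported here so the helpers above can be used without pywikibot.
    import pywikibot

    # Connect to Wikidata directly.
    repo = pywikibot.Site("wikidata", "wikidata")

    # Load the page from the previous task and print its wikitext.
    page = pywikibot.Page(repo, f"User:{username}/Outreachy_1")
    print(page.text)

    # Append 'Hello' and save the page back (this is a live edit).
    page.text = append_line(page.text)
    page.save(summary="Adding Hello (Outreachy task 2)")

    # Load the sandbox item and print some information from it.
    item = pywikibot.ItemPage(repo, "Q4115189")
    print(summarise_item(item.get()))


# main()  # uncomment once user-config.py is set up and you can log in
```

Keeping the text manipulation in a separate function makes it easy to check what will be written before anything is saved to the wiki.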

Save your code to a repository, or create a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_2 (under your username).

Once you are happy, send me a link to your page (by email, on my talk page, or replying to this ticket as you prefer). Make sure to also register it as a contribution on the Outreachy website ( https://www.outreachy.org/outreachy-december-2021-internship-round/communities/wikimedia/automatically-matching-new-wikipedia-articles-with/contributions/ )!

Event Timeline

Hi @Mike_Peel, I have been approved to participate in this year's Outreachy contribution stage, and I have done the task. I hope you can have a look at it; I am open to suggestions for improvement!
https://www.wikidata.org/wiki/User:Osamaahmed17/Outreachy-2

Hi, nice start! I've asked a question on the talk page: https://www.wikidata.org/wiki/User_talk:Osamaahmed17/Outreachy-2

I have another question: should I register it as a contribution on the Outreachy website before your approval, or should I do it after you confirm that it is perfect? @Mike_Peel

@Nafiya_Ahmed: Please ask general Outreachy process questions in general Outreachy support places. See https://www.mediawiki.org/wiki/Outreachy/Participants - thanks!

Sir @Mike_Peel, I am having an issue while configuring Pywikibot. When I change to the Pywikibot directory using the command cd "path", it shows the following error:

cd : Cannot find path 'D:\Users\varda\AppData\Local\Programs\Python\Python39\Lib\site-packages\pywikibot' because it does not exist.
At line:1 char:1
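An error like this says the directory itself is missing, which may mean pywikibot is not installed in that Python's site-packages at all. One way to check, sketched here with only the standard library, is to ask Python where it would import the package from. The helper name find_package_dir is made up for this example.

```python
import importlib.util
from pathlib import Path


def find_package_dir(package_name):
    """Return the directory a package would be imported from, or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


# Prints a directory if this Python can import pywikibot, otherwise None.
print(find_package_dir("pywikibot"))
```

If this prints None, the path in the cd command cannot exist for this Python installation, and pywikibot would need to be installed (or downloaded) first.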

@Veeshah
Are you trying to install pywikibot?
Have you downloaded the "core-stable" zip file (the file that has the pywikibot folder)?

@Mike_Peel Hello, I had serious problems setting up pywikibot on my PC (I use Ubuntu) and could not log in, so I started using PAWS instead. Will that be a problem? I am also stuck on what to do next since, as I told you before, I have forgotten a lot of Python. You said to set up a script that will connect to Wikidata, so I ran:

import pywikibot
enwiki = pywikibot.Site('en', 'wikipedia')
enwiki_repo = enwiki.data_repository()
page = pywikibot.Page(enwiki_repo, 'User:Nafiya_Ahmed/Outreachy 1')

I guess this is the code to connect, but when I ran print(page.text) my output was blank. What should the appropriate output be?

@Nafiya_Ahmed, did you check the response with print(page)?

@Caseyy0000 Yes, I am trying to follow all the steps mentioned at https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation
and I am stuck at the "configure pywikibot" part.

@Veeshah
I faced similar issues while configuring pywikibot on "python3". Perhaps you should use Anaconda, and make sure you "cd" to the pywikibot directory and update your "pip" before continuing.

@Osamaahmed17 The response is now [[wikidata:User:Nafiya-Ahmed/Outreachy 1]]
Is that okay, or should it print out the entire page text?

@Osamaahmed17 I noticed that whatever I type in page = pywikibot.Page(enwiki_repo, 'User:Nafiya-Ahmed/Outreachy 1') is what gets printed. I am stuck on what exactly happens after using this command.
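For readers hitting the same confusion: print(page) shows the page's title as a wiki link (as in the [[wikidata:...]] output above), not its content, so it echoes whatever title was typed whether or not that page exists. The content is in page.text, and page.exists() tells whether there is a page at all. A small sketch that puts the three side by side follows; the helper name page_report is made up for illustration, and it works on any object that offers exists() and text the way a pywikibot Page does.

```python
def page_report(page):
    """Return the three values that are easy to mix up, one per line."""
    return "\n".join([
        f"str(page)     = {page}",
        f"page.exists() = {page.exists()}",
        f"page.text     = {page.text!r}",
    ])


# With pywikibot configured, usage would look like this:
#   import pywikibot
#   repo = pywikibot.Site("wikidata", "wikidata")
#   page = pywikibot.Page(repo, "User:Nafiya_Ahmed/Outreachy 1")
#   print(page_report(page))
```

If page.exists() comes back False, an empty page.text is expected, and the title is the first thing to check.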

In case you're still having issues with logging in, this is what worked for me.

Instead of pip installing pywikibot, manually download the latest version from the installation info site. Then follow the instructions here
https://www.mediawiki.org/wiki/Manual:Pywikibot/login.py
to create a user-config.py in the core_stable directory, or modify it if it already exists. Then create a .py file for the password in the specified format; this should contain your Wikidata account password. Create a .ipynb/.py file in this directory and run

pywikibot.Site().login()

The pip installed pywikibot needs some other files for those instructions, but using the full download worked.
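To confirm that the login actually succeeded, the call above can be followed by a quick status check. The sketch below is only an illustration: the helper name login_and_report is made up, and it assumes the site object provides login(), logged_in() and user(), as pywikibot's API sites do. login() may also raise an exception of its own if no username is configured for the site.

```python
def login_and_report(site):
    """Log in on site and return a short status message."""
    site.login()
    if site.logged_in():
        return f"Logged in to {site} as {site.user()}"
    return f"Not logged in to {site}; check user-config.py and the password file"


# With pywikibot configured, usage would look like this:
#   import pywikibot
#   print(login_and_report(pywikibot.Site("wikidata", "wikidata")))
```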

Hi, sorry for the slow reply here.

@Nafiya_Ahmed: your code seems to work OK for me. However, if you changed the last line to page = pywikibot.Page(enwiki_repo, 'User:Nafiya-Ahmed/Outreachy 1'), as in one of your comments, you will get a blank page because of the - rather than _. Perhaps try:
page = pywikibot.Page(enwiki, 'Sandbox')
print(page.text)
and make sure that works OK?
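Some background on why the hyphen matters: in MediaWiki page titles an underscore and a space are interchangeable, but a hyphen is a different character, so the hyphenated title names a different (here non-existent) page. The sketch below is a deliberately simplified comparison, not MediaWiki's full title normalisation (it ignores first-letter capitalisation and namespace aliases, for instance), and the helper names are made up for illustration.

```python
def normalise_title(title):
    """Simplified title normalisation: underscores count as spaces."""
    return " ".join(title.replace("_", " ").split())


def same_title(first, second):
    """Return True if two titles point at the same page under this rule."""
    return normalise_title(first) == normalise_title(second)


# Underscore and space are interchangeable:
print(same_title("User:Nafiya_Ahmed/Outreachy 1", "User:Nafiya_Ahmed/Outreachy_1"))
# A hyphen is not:
print(same_title("User:Nafiya-Ahmed/Outreachy 1", "User:Nafiya_Ahmed/Outreachy 1"))
```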

@Veeshah are you still having problems? Does @AlexGP's approach help you?

@Nafiya_Ahmed You can register it as a contribution on Outreachy, but set 'Status: Not accepted or merged' until I've given you feedback on the code.

Hello @Mike_Peel, I have started working on task 2. I would appreciate your feedback. Thank you.
Here is the link: https://www.wikidata.org/wiki/User:Nancy_Sal/Outreachy_2

@Mike_Peel It worked, but I want to know why it worked for sandbox but not for User:Nafiya-Ahmed/Outreachy 1

Try changing the - to an _ between your first and last name.

@Mike_Peel It works not only for the sandbox but for everything that has a Q number, like Harry Potter etc., but not for anything that has a P number. Why?
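The thread does not answer this directly, so the following is only a likely explanation: on Wikidata, items (Q numbers) are stored in the main namespace, while properties (P numbers) are stored in the "Property:" namespace, so a plain page titled 'P31' does not exist. Pywikibot also has separate ItemPage and PropertyPage classes for the two. The helper names below are made up for this sketch.

```python
def page_title_for(entity_id):
    """Return the wiki page title under which an item or property is stored."""
    entity_id = entity_id.strip().upper()
    if not entity_id[1:].isdigit():
        raise ValueError(f"Not an item or property ID: {entity_id!r}")
    if entity_id.startswith("Q"):
        return entity_id  # items are stored in the main namespace
    if entity_id.startswith("P"):
        return "Property:" + entity_id  # properties have their own namespace
    raise ValueError(f"Not an item or property ID: {entity_id!r}")


def load_entity(repo, entity_id):
    """Return an ItemPage or PropertyPage for the ID (needs pywikibot)."""
    import pywikibot  # imported here so page_title_for works without it

    if entity_id.strip().upper().startswith("P"):
        return pywikibot.PropertyPage(repo, entity_id.strip().upper())
    return pywikibot.ItemPage(repo, entity_id.strip().upper())


print(page_title_for("Q4115189"))
print(page_title_for("P31"))
```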

@Mike_Peel Yes, I used an underscore too, but it is still giving me a blank output.

@Mike_Peel It worked, thank you for the help.

OK, glad you got it working. :-) Really odd that it's been so intermittent! Normally it works well, unless the page you're loading doesn't exist yet.

Hi everyone, hi @Mike_Peel, I've been able to log in and add my username to the user configuration file. I've been trying to set up a script to load my Wikidata page (point 3), but I haven't managed it yet. I'm trying to find the correct script to make the connection with the Wikidata page. I've typed the following in the console, inside the Python interpreter:

import pywikibot

Then:

site = pywikibot.Site()
page = pywikibot.Page(site, u"User:Irene_Ovadia/Outreachy_1")
text = page.text

But I can't tell whether this is correct or not; I need some guidance at this stage.
Thank you

That looks right. If you print(text) - you should see the page text. Then you can modify and save it back. Are you getting any error messages so far?

Thank you, that worked fine! I had some errors when manually adding the login information to the user configuration file, but I was able to resolve them. I was hesitant about printing the script because I didn't understand why the script is printed on an 'Outreachy 1' page on Wikipedia that is created at the moment of running it. I didn't know what to expect and that confused me.
I am now working on point 5; here is the link: https://www.wikidata.org/wiki/User:Irene_Ovadia/Outreachy_2

Getting there - I've given you feedback on the talk page. :-)

Dear @Mike_Peel,
Sorry for such a delay with the second task, I was really struggling with it. ))) And with my MacBook, and with a new version of macOS, etc.
But finally, please find below a link to my completed task 2:
https://www.wikidata.org/wiki/User:Pandamasha/Outreachy_2. I hope that I got it right. Looking forward to your comments.
Have a wonderful weekend, all!
With kindest regards, Masha (Pandamasha)

Hello @Mike_Peel! Could you please comment on my second task?

Hi, I've now replied on the talk page, sorry for the delay.

Thank you very much, @Mike_Peel! I was just trying to follow the Zen of Python: "Simple is better than complex." :-)))

Hello, @Mike_Peel. I am attaching the link to my 2nd task.
Kindly comment on it.
Sorry for the delay.

https://www.wikidata.org/wiki/User:Suha_098/Outreachy_2

I've replied on the talk page.

Hi, @Mike_Peel,

Kindly find the link to my second task - https://www.wikidata.org/wiki/User:Odohemma/Outreachy_2

Since there's no talk page: this looks good, you can mark it as accepted/completed for Outreachy.