How to Import Data from Wikipedia? by Dhafer Malouche


Wikipedia contains a huge number of tables, and it can be useful to write R code that extracts whichever of them you would like to collect.

In this tutorial I will show how you can collect such data, using two examples: one on the ISO codes of countries and another on Tunisian elections.

Example 1 (ISO2 Table)

I’m interested here in the table that gives the ISO alpha-2 code of every country. When you visit the following URL:, you can notice that this table is the second table on the page.

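For this purpose we need the htmltab package installed in our R environment. Its which argument selects the table to read by position, here the second one on the page.

> library(htmltab)
> u <- ""
> doc <- htmltab(u, which = 2)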
> class(doc)
## [1] "data.frame"
> dim(doc)
## [1] 249   6
> colnames(doc)
## [1] "English short name (using title case)"
## [2] "Alpha-2 code"                         
## [3] "Alpha-3 code"                         
## [4] "Numeric code"                         
## [5] "Link to ISO 3166-2 subdivision codes" 
## [6] "Independent"
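
The column names above are long and contain spaces, which makes them awkward to use with `$`. A minimal sketch of one way to shorten them, run on a small hand-built stand-in for `doc` so that it works offline. The three rows are typed in by hand, and the short names (`country`, `alpha2`, and so on) are illustrative choices, not something htmltab produces.

```r
# Toy stand-in for `doc`, with the same six column names as the scraped table.
# The rows are entered by hand for illustration only.
toy <- data.frame(
  "English short name (using title case)" = c("Antarctica", "France", "Tunisia"),
  "Alpha-2 code" = c("AQ", "FR", "TN"),
  "Alpha-3 code" = c("ATA", "FRA", "TUN"),
  "Numeric code" = c("010", "250", "788"),  # kept as text to preserve the leading zero
  "Link to ISO 3166-2 subdivision codes" = c("ISO 3166-2:AQ", "ISO 3166-2:FR", "ISO 3166-2:TN"),
  "Independent" = c("No", "Yes", "Yes"),
  check.names = FALSE,       # keep the names exactly as written
  stringsAsFactors = FALSE
)

# Hypothetical short names, chosen here for convenience.
colnames(toy) <- c("country", "alpha2", "alpha3", "numeric", "iso3166_2", "independent")

# Columns can now be reached without backticks.
toy$alpha2
```

The same `colnames<-` assignment could be applied to `doc` itself, provided the six columns come back in the order shown above.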

Let’s now use the DT package to display the table restricted to independent countries.

> library(DT)
> i <- which(doc$Independent == "Yes")
> doc <- doc[i, 1:5]
> datatable(doc, filter = 'top', rownames = FALSE,
+           extensions = 'Buttons',
+           options = list(dom = 'Bfrtip', pageLength = 25,
+                          autoWidth = TRUE,
+                          buttons = c('copy', 'csv', 'excel', 'pdf', 'print', I('colvis'))
+ ))
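
The filter above goes through `which()` rather than indexing with the logical vector directly. A short sketch of why that can matter, on a toy data frame (the `NA` is an assumption standing in for a cell that failed to scrape; the real table may contain none):

```r
# Toy data frame; the NA mimics a cell that could not be read.
toy <- data.frame(
  name = c("Antarctica", "France", "Tunisia", "Unknown"),
  Independent = c("No", "Yes", "Yes", NA),
  stringsAsFactors = FALSE
)

# which() returns only the positions where the test is TRUE,
# so the row with a missing value is dropped.
i <- which(toy$Independent == "Yes")
kept <- toy[i, ]

# Plain logical indexing turns the NA into an extra all-NA row.
with_na <- toy[toy$Independent == "Yes", ]

nrow(kept)     # 2
nrow(with_na)  # 3
```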

Example 2 (Election Data)

We’re now interested in the Wikipedia pages dealing with Tunisian elections since 2011. We can find four pages, covering the municipal, presidential, parliamentary and Constituent Assembly elections:

> dt_muni <- htmltab("", 4)
> dt_pres <- htmltab("", 8)
> dt_par <- htmltab("", 7)
> dt_anc <- htmltab("", 8)
> head(dt_anc)
##             Parties >> Valid votes >> Total
## 2                          Ennahda Movement
## 3                 Congress for the Republic
## 4                          Popular Petition
## 5 Democratic Forum for Labour and Liberties
## 6              Progressive Democratic Party
## 7                            The Initiative
##   Votes >> 4,053,148 >> 4,308,888 % >> 94.06 >> 100.00 NA
## 2                       1,501,320                37.04 89
## 3                         353,041                 8.71 29
## 4                         273,362                 6.74 26
## 5                         284,989                 7.03 20
## 6                         159,826                 3.94 16
## 7                         129,120                 3.19  5
> head(dt_par)
##   Party, coalition and independent lists     Votes % Votes Seats % Seats
## 3                           Nidaa Tounes 1,279,941  37.56%    86  39.63%
## 4                       Ennahda Movement   947,014  27.79%    69  31.79%
## 5                   Free Patriotic Union   140,873   4.13%    16   7.37%
## 6                          Popular Front   124,046   3.64%    15   6.91%
## 7                            Afek Tounes   102,915   3.02%     8   3.68%
## 8              Congress for the Republic    69,794   2.05%     4   1.84%
##   Swing
## 3   N/A
## 4   −20
## 5   +15
## 6   +11
## 7    +5
## 8   −25
> head(dt_muni)
##   Parti, coalition ou liste    Voix     %   Conseillers   %.1    Maires
## 2                  Ennahdha 517 234 28,64 2 135 / 7 212 29,68 131 / 350
## 3              Nidaa Tounes 377 121 20,85 1 600 / 7 212 22,17  76 / 350
## 4         Courant démocrate  75 619  4,19   205 / 7 212  2,85   3 / 350
## 5           Front populaire  71 551  3,95   261 / 7 212  3,60   8 / 350
## 6              Union civile  31 883  1,77    66 / 7 212  0,92   2 / 350
## 7           Machrouu Tounes  26 013  1,44   124 / 7 212  1,72   0 / 350
> head(dt_pres)
##   Candidates >> Total               Parties >> Total
## 3   Beji Caid Essebsi                   Nidaa Tounes
## 4     Moncef Marzouki      Congress for the Republic
## 5       Hamma Hammami                  Popular Front
## 6        Hechmi Hamdi                Current of Love
## 7          Slim Riahi           Free Patriotic Union
## 8       Kamel Morjane National Destourian Initiative
##   First round >> Votes >> 3,267,569 First round >> % >> 100%
## 3                         1,289,384                   39.46%
## 4                         1,092,418                   33.43%
## 5                           255,529                    7.82%
## 6                           187,923                    5.75%
## 7                           181,407                    5.55%
## 8                            41,614                    1.27%
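
The vote counts and percentages above are printed with thousands separators, percent signs, a Unicode minus sign and, in the French table, spaces and decimal commas, so they are most likely stored as text and need converting before any arithmetic. Below is a minimal sketch of one way to do that in base R. The helper `to_number()` and the toy data frame `toy_anc` are hypothetical: the values are copied by hand from the printed output, and the kinds of space characters handled are an assumption about what the pages contain.

```r
# Hypothetical helper: turn scraped number strings into numeric values.
# decimal_comma = TRUE handles the French layout ("517 234", "28,64").
to_number <- function(x, decimal_comma = FALSE) {
  x <- gsub("\u2212", "-", x, fixed = TRUE)      # Unicode minus -> ASCII hyphen
  for (sp in c(" ", "\u00a0", "\u202f")) {       # plain, no-break and narrow no-break spaces
    x <- gsub(sp, "", x, fixed = TRUE)
  }
  if (decimal_comma) {
    x <- sub(",", ".", x, fixed = TRUE)          # comma is the decimal mark
  } else {
    x <- gsub(",", "", x, fixed = TRUE)          # comma is a thousands separator
  }
  x <- gsub("[%+]", "", x)                       # drop percent and plus signs
  suppressWarnings(as.numeric(x))                # anything else (e.g. "N/A") becomes NA
}

# Toy stand-in for the first rows of dt_anc, typed in from the output above.
toy_anc <- data.frame(
  "Parties >> Valid votes >> Total" = c("Ennahda Movement", "Congress for the Republic"),
  "Votes >> 4,053,148 >> 4,308,888" = c("1,501,320", "353,041"),
  "% >> 94.06 >> 100.00" = c("37.04", "8.71"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the first header level, i.e. the text before the first " >> ".
names(toy_anc) <- sub(" >> .*$", "", names(toy_anc))

toy_anc$Votes <- to_number(toy_anc$Votes)
toy_anc[["%"]] <- to_number(toy_anc[["%"]])

# English and French layouts, with values taken from the tables above.
to_number(c("37.56%", "\u221220", "+15", "N/A"))
to_number(c("517 234", "28,64"), decimal_comma = TRUE)
```

Trimming the header to its first level works for `dt_anc`, but it would produce duplicate names for `dt_pres`, where two columns both start with "First round"; there it is probably simpler to assign names by hand.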