Scraping Daily Weather Data From Ogimet
June 21, 2024 Vaidya R
Ogimet.com is a weather data portal that aggregates data from official sites from around the world. In the case of Indian cities, the data is pulled from IMD stations. The site has data on Indian cities from October 1999 till date.
The site identifies stations using an index known as the WMO Index, where WMO stands for World Meteorological Organisation, a United Nations body. The WMO Index is a five-digit number, and numbers starting from 42XXX and up to 433XX belong to India, for a total of 1399 indices.
For example, the IMD Bangalore station is 43295, while the HAL Airport is 43296. In Chennai, there are two stations, one at Nungambakkam (43278) and the airport one at Meenambakkam (43279). The Mumbai stations are at Colaba (43057) and the airport one at Santacruz (43003). A complete list of stations and their codes can be accessed on the IMD site here. Use the dropdown box called Station to see the list.
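The station codes mentioned above can be kept in a small lookup table, with a quick sanity check on the range. This is only a sketch: the names STATIONS and looks_like_indian_wmo_index are made up for illustration, and the range check assumes that "42XXX up to 433XX" means 42000 to 43399, which is one reading of the range given above.

```python
# Station codes quoted in this post; the labels are illustrative.
STATIONS = {
    "Bangalore (IMD)": 43295,
    "Bangalore (HAL Airport)": 43296,
    "Chennai (Nungambakkam)": 43278,
    "Chennai (Meenambakkam Airport)": 43279,
    "Mumbai (Colaba)": 43057,
    "Mumbai (Santacruz Airport)": 43003,
}


def looks_like_indian_wmo_index(code):
    """Return True if code is a five-digit number in the assumed Indian range.

    Assumption: the Indian block runs from 42000 to 43399 inclusive.
    A code inside the range is not guaranteed to be an active station.
    """
    text = str(code)
    if len(text) != 5 or not text.isdigit():
        return False
    return 42000 <= int(text) <= 43399


if __name__ == "__main__":
    for name, code in STATIONS.items():
        print(name, code, looks_like_indian_wmo_index(code))
```

A check like this only catches typos in the code; the IMD station list remains the place to confirm that a station actually exists.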
A sample output from Ogimet.com for IMD Bangalore for the month of May in 2024 looks like the below.
You can see that Bangalore recorded a maximum temperature of 38.2°C on the 1st of May and again on the 3rd of May. It also recorded rainfall of 30mm on the 19th of May and 26mm on the 14th, under the header “Prec. (mm)” which stands for Precipitation.
Scraper for Ogimet
Ogimet data for a specific month and year can be fetched by modifying the URL. The URL for May 2024, the above image, is: “https://www.ogimet.com/cgi-bin/gsynres?lang=en&ord=REV&ndays=31&ano=2024&mes=05&day=31&hora=03&ind=43295”, where
- mes is the month number, 5 in the case of May.
- ano is the year, 2024 in this case.
- ndays is the number of days’ data to display.
- day is the last date of the record, in this case it would be 31.
- ind is the WMO index, 43295 in the case of Bangalore.
- hora is the hour at which the data should be read. It is in UTC time zone, and we should use 03 as that corresponds to 8:30 AM IST which is when the day’s data is finalized by IMD.
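The parameters above can be assembled into a URL with a few lines of Python. This is a minimal sketch and not the code from the scraper itself: the function names build_ogimet_url and utc_hour_to_ist are made up for illustration, and it assumes you always want the full month, so ndays and day are both set to the last day of the month.

```python
import calendar
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

BASE_URL = "https://www.ogimet.com/cgi-bin/gsynres"
IST = timezone(timedelta(hours=5, minutes=30))


def build_ogimet_url(year, month, wmo_index, hour_utc=3):
    """Build the monthly summary URL for one station (hypothetical helper)."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    params = {
        "lang": "en",
        "ord": "REV",
        "ndays": last_day,           # number of days to display
        "ano": year,                 # year
        "mes": f"{month:02d}",       # month, zero-padded as in the sample URL
        "day": last_day,             # last date of the record
        "hora": f"{hour_utc:02d}",   # hour in UTC
        "ind": wmo_index,            # WMO index of the station
    }
    return f"{BASE_URL}?{urlencode(params)}"


def utc_hour_to_ist(year, month, day, hour_utc):
    """Convert a UTC hour on a given date to Indian Standard Time."""
    moment = datetime(year, month, day, hour_utc, tzinfo=timezone.utc)
    return moment.astimezone(IST)


if __name__ == "__main__":
    print(build_ogimet_url(2024, 5, 43295))
    print(utc_hour_to_ist(2024, 5, 31, 3).strftime("%H:%M"))
```

Calling build_ogimet_url(2024, 5, 43295) reproduces the May 2024 URL shown above, and the second helper confirms that 03 UTC is 08:30 in IST.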
A Python scraper to get this data in the form of a csv file can be accessed at this location.
Prerequisites for the scraper
- You need to have python installed. If you don’t have python you can install it from here.
- You need to install the following packages. They can be installed using the command “pip install <package name>”, for eg: pip install requests, to install the requests package mentioned below. The csv module that the script also uses ships with python, so it does not need to be installed.
- bs4
- requests
How to use the scraper
- Run the python script through command line or any other python interface: “python scrape_ogimet_data.py”
- It will first ask for the year to scrape: “Enter year: ”. Enter the year in 4 digits, for eg: 2023, and press “Enter”.
- You will then be asked for the station code: “Enter IMD station code(find it in the dropdown called Station here: https://cdsp.imdpune.gov.in/home_riturang_sn.php):”. Enter the five digit IMD code, for eg 43295 for Bangalore, and press “Enter”.
- The rainfall, max temperature and min temperature for each date will be scraped to a csv file with the IMD code as name: 43295.csv in the above case. The file will be in the same folder from where you are running python.
- Running the script again for additional years for the same station will append the data to the same file. So you can run from 2021 to 2023 one after the other and find it all in the same file.
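The append behaviour described in the last step can be sketched with the csv module. This is an illustration under assumptions, not the scraper's actual code: the function name append_daily_rows is made up, the column order (date, rainfall, max temperature, min temperature) follows the description in this post, and the sample values and date format in the demo are placeholders rather than real readings.

```python
import csv
import tempfile
from pathlib import Path


def append_daily_rows(wmo_index, rows, folder="."):
    """Append rows to <wmo_index>.csv in folder and return the file path.

    Opening the file in "a" mode creates it on the first run and adds to
    the end on later runs, so several years end up in the same file.
    """
    path = Path(folder) / f"{wmo_index}.csv"
    # newline="" lets the csv module control line endings itself.
    with path.open("a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return path


if __name__ == "__main__":
    # Placeholder rows: (date, rainfall in mm, max temp, min temp).
    with tempfile.TemporaryDirectory() as folder:
        append_daily_rows(43295, [("2021-01-01", 0.0, 28.0, 15.0)], folder)
        path = append_daily_rows(43295, [("2022-01-01", 1.5, 29.0, 16.0)], folder)
        print(path.read_text(encoding="utf-8"))
```

One consequence of appending is that running the same year twice would write its rows twice, so it is worth checking the file if a run is repeated.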
The final output will have one row per date, with the date, rainfall in mm, and max and min temperature in °C.