
Explainer: Understanding the HCES 2022 for Cities Dataset

August 16, 2024 Vaidya R

The Ministry of Statistics and Programme Implementation (MoSPI), Government of India, has released the data from the Household Consumption and Expenditure Survey (HCES), 2022. While the MoSPI website contains the summarised data, the raw data from the survey is available on the National Data Archive (NADA).

At OpenCity.in, we have downloaded this raw data, processed the text files into accessible CSV files, and separated the content by city for seven major cities – Bengaluru, Chennai, Delhi, Hyderabad, Kolkata, Mumbai and Pune.

What is in each dataset?

Each dataset has the following resources:

  1. CSV data for 15 levels, as defined by MoSPI.
    Each resource is called “HCES Level 01 Data”, “HCES Level 02 Data” and so on till “HCES Level 15 Data”.
  2. A layout file in XLSX format called “HCES 2022 Layout File”. This file has details on each level – the questions asked and the answer format.
  3. A guide in PDF format called “HCES 2022 Questionnaire Guide”. This has additional details on the values.

Understanding the Levels

Before we delve into the levels, we need to understand the “Common ID”. The common ID is a field with details on the data like state, district, survey region, sub-division etc. In the layout file, the common ID is referred to as a single field.

However, in each of the level files the common ID is split into its components, each with its own header name. Descriptive header names are available only up to the Questionnaire Number and Level columns. The “Level” column will match the level file you are looking at.

Where the header says “Col 1” or “Col 2”, it needs to be matched with the corresponding question in the layout file, using the “Col” column there. For example, in Level 02, “Col 1” is the question that follows Level in the layout file for Level 02, which is “Person Srl No.”, and “Col 3” is “Gender”. The numbers 1 and 3 appear against these questions under “Col” in the layout file.

For these reasons, it is important to look at the layout file and the questionnaire guide before going into the level data. They can guide you on which file to look at to get the answers you are looking for.
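
The header matching described above can be sketched in a few lines of Python. This is only an illustration, not the way the CSVs were produced: the layout dictionary holds just the two Level 02 entries named in this article, the sample row is made up, and the function name rename_col_headers is our own.

```python
import csv
import io

# Excerpt of the layout file for Level 02 (Col number -> question).
# Only the two entries named in this article are filled in; the rest
# would have to be copied from the layout file.
LEVEL_02_LAYOUT = {1: "Person Srl No.", 3: "Gender"}


def rename_col_headers(headers, layout):
    """Replace "Col N" headers with the layout question; keep the rest as-is."""
    renamed = []
    for header in headers:
        parts = header.split()
        if len(parts) == 2 and parts[0] == "Col" and parts[1].isdigit():
            # Fall back to the original header when the layout has no entry.
            renamed.append(layout.get(int(parts[1]), header))
        else:
            renamed.append(header)
    return renamed


# Made-up sample in the shape of a level file (not real survey data).
sample_csv = "Level,Col 1,Col 2,Col 3\n02,1,9,2\n"
reader = csv.reader(io.StringIO(sample_csv))
headers = rename_col_headers(next(reader), LEVEL_02_LAYOUT)
rows = [dict(zip(headers, row)) for row in reader]
print(headers)
```

Headers that are missing from the dictionary, like “Col 2” here, keep their original name, so it is easy to spot which questions still need to be looked up in the layout file.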

Layouts and Guide

The layout file tells you what each column in each level means. For example, for Level 03 it shows that there are 34 columns (counting the Common ID as one field) and that the column after Level is “HH Size”, or household size. The layout file also has a column called Question, which holds section numbers like “4.1” or “6.2”. These point into the Questionnaire Guide, where you can look up the relevant section to know more about the question asked.

The Questionnaire Guide needs to be read in tandem with the layout file. In most cases the guide simply points to the question asked and the answer is self-evident. For example, the answer to the household size question is the household size itself.

In some cases, the answer is a code which needs to be read with the help of the guide. For example, in Level 06 the consumption is recorded by item code, and the layout file refers you to sections 7.1 and 7.2 of the guide, which list processed food items and their codes. An item code of “291” is for chocolates, and the quantity and value in Rs. that follow in that row refer to the consumption of chocolates in the household over the last 7 days.

For example, one household in Hyderabad spent Rs. 80 on chocolates, and another spent Rs. 50 on item 290, which is cakes and pastries, in the last 7 days.
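
Reading coded answers like these can be sketched as a simple lookup. The sketch below assumes hypothetical header names (“Item Code” and “Value”) and uses only the two item codes and amounts quoted above. The real Level 06 headers need to be taken from the layout file, and the full code list from the guide.

```python
import csv
import io

# Item codes quoted in this article; the full list is in sections 7.1
# and 7.2 of the Questionnaire Guide.
ITEM_NAMES = {"290": "cakes and pastries", "291": "chocolates"}

# Illustrative rows with hypothetical header names. The two amounts are
# the Hyderabad examples mentioned in the text.
sample_csv = "Item Code,Value\n291,80\n290,50\n"


def label_items(csv_text, item_names):
    """Return (item name, value in Rs.) pairs for each row of the CSV text."""
    labelled = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row["Item Code"].strip()
        name = item_names.get(code, f"unknown item {code}")
        labelled.append((name, float(row["Value"])))
    return labelled


for name, value in label_items(sample_csv, ITEM_NAMES):
    print(f"{name}: Rs. {value:.0f} in the last 7 days")
```

Codes that are not in the dictionary are reported as unknown rather than dropped, which helps when the lookup table is still being filled in from the guide.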

What do the levels mean?

  1. Level 01 is just the common ID. This has information about the state ID, district and sub-divisions at the district level. Even though the layouts file shows the common ID as one field, in our CSVs this common ID is split in all the levels to make it easier to understand the district and sub-district data.
  2. Level 02 Data has details of the persons surveyed. For each person questioned it has data on age, gender, education level, marital status and meals in the last 30 days.
  3. Level 03 Data has details on occupation – agricultural or non-agricultural, employment, religion of the household, land ownership, state of housing, access to cooking fuel, water, toilets and drainage, and health.
  4. Level 04 data has details on ration cards and purchases through the cards, as well as online purchases of fruits, vegetables, milk, protein and other food items.
  5. Level 05 data has details on consumption of home produced goods and total consumption. These goods include
    • cereals and cereal substitutes (rice, wheat, millets),
    • pulses and pulse products like soya,
    • sugar and salt,
    • milk and milk products,
    • vegetables, fruits and dry fruits
    • eggs, fish and meat.
  6. Level 06 data has details on consumption of served packaged food and packaged processed food over the last week.
  7. Level 07 data has details on access to government provided free/subsidized items like
    • cooking fuels like LPG,
    • education for children – fees, schoolbags, stationery, textbooks,
    • electricity,
    • health in terms of hospitalizations and spending on healthcare.
      It also includes online spending on many of these items. The questions are more about access to government subsidies for the above than about quantitative spending on the items.
  8. Level 08 data has details on quantity and spending on energy – fuels, light, electricity. This differs from Level 07, where the questions are about access to different types of fuel; here they are about quantity and spending.
  9. Level 09 data has details on specific items consumed for daily activities. This includes a broad variety of items like
    • toiletries like soaps, shampoos and shaving items,
    • cleaning and household items like floor cleaners, mosquito repellents, electric bulbs, plastic items, batteries,
    • schools – fees, books,
    • hospitalisation, non-hospitalization medical expenses,
    • transport – bus fares and railway fare, petrol/diesel spending,
    • spending on telephone, domestic help, watchman, pets, maintenance of house, etc.
    • expenditure on entertainment and
    • rent – house as well as consumer goods.
  10. Level 10 data has details on pan, tobacco and alcohol consumed, in terms of quantity and spending.
  11. Level 11 data has details on whether items were bought online and the types of items that were bought online – electronic items, household appliances. It also records ownership of consumer items like laptops, phones, vehicles, refrigerators, ACs and washing machines.
  12. Level 12 data has spending on clothing, footwear and bedding over the last 30 days in a household.
  13. Level 13 data has details on expenditure for purchase and construction (including repair and maintenance) of durable goods for domestic use during the last 365 days. These durable goods include
    • personal use items – umbrellas, phones, watches, etc,
    • transport equipment – vehicles, tyres,
    • sports goods like badminton rackets, cricket bats, fitness machines, etc,
    • medical equipment like wheelchairs, BP monitors, sugar monitors etc,
    • cooking and household items – washing machines, refrigerators, pressure cooker etc,
    • crockery and utensils,
    • furniture and fixtures,
    • recreational equipment like TVs, CD/DVD players, cameras, etc,
    • construction material for minor repairs, plugs and switches etc,
    • jewellery and other ornamental items.
  14. Level 14 data is a summary of items consumed in all the earlier sections. The summary divides the items into three groups:
    • Food consumption. The total expenditure on Levels 3, 4, 5 and 6 can be accessed here.
    • Fuel, education, medical, entertainment, transport, rent and intoxicants. The total expenditure on Levels 7, 8, 9 and 10 can be accessed here.
    • Personal goods – clothing, footwear, equipment for transport, medical, sports, cooking and household etc. The total expenditure on Levels 11, 12 and 13 can be accessed here.
  15. Level 15 data has final details on total consumption per month of a household.

To give you an idea at a household level, Level 15 is the total expenditure of the household over a month, which is the sum of the expenditure on the different groups of goods captured in Level 14. More granular expenditure, such as on cereals, pulses or toiletries, can be found by drilling into the lower-numbered levels.
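
The relationship between Level 14 and Level 15 can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the household keys, the group labels and the amounts are made up, and the real files identify a household through the common ID columns. It also assumes that the Level 14 values are already monthly figures.

```python
from collections import defaultdict

# Made-up rows in a hypothetical shape:
# (household key, Level 14 group, monthly value in Rs.)
level_14_rows = [
    ("HH-A", "food", 6000.0),
    ("HH-A", "fuel, education, medical and others", 4000.0),
    ("HH-A", "personal and durable goods", 1500.0),
    ("HH-B", "food", 5000.0),
    ("HH-B", "fuel, education, medical and others", 2500.0),
]


def household_totals(rows):
    """Sum the Level 14 group values for each household."""
    totals = defaultdict(float)
    for household, _group, value in rows:
        totals[household] += value
    return dict(totals)


totals = household_totals(level_14_rows)
print(totals)
```

Comparing totals like these against the Level 15 figure for the same household is one way to check that the levels have been read and joined correctly.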
