How to convert JSON to Python Object

How to convert JSON to Python Object

Python’s json library has many utilities for encoding and decoding data in JSON format. In particular, the json.load() method decodes an JSON read as a file, and the json.loads() decode an JSON read as a string. In general, when decoding JSON files, the data is converted to Python dictionaries, but it is possible to convert it to a custom object by using the parameter object_hook.

For instance, suppose you have the following JSON object:

json_obj = """{
  "name" : "Felipe",
  "email" : "[email protected]",
  "age" : 29
}"""

and the following class:

class User():
  name : str
  email : str
  age : int
  def __init__(self, input):
      self.name = input.get("name")
      self.email = input.get("email")
      self.age = input.get("age")

If we call json.loads() with User as the object_hook parameter, the User.__init__() method will be called with the JSON’s corresponding dict as input.

import json

user = json.loads(json_obj, object_hook = User)
print(f"User {user.name}, age {user.age}, email {user.email}")
User Felipe, age 29, email [email protected]

But what if you have a nested JSON?

json.loads() actually calls the object_hook function every time it reads a fully formed JSON object from the string. Consider the following JSON, returned from the Random User Generator API

json_obj = """{
            "gender": "male",
            "name": {
                "title": "Mr",
                "first": "Ian",
                "last": "Walters"
            },
            "location": {
                "street": {
                    "number": 3161,
                    "name": "Saddle Dr"
                },
                "city": "Bendigo",
                "state": "Western Australia",
                "country": "Australia",
                "postcode": 4285,
                "coordinates": {
                    "latitude": "-84.7903",
                    "longitude": "-29.1020"
                },
                "timezone": {
                    "offset": "+9:00",
                    "description": "Tokyo, Seoul, Osaka, Sapporo, Yakutsk"
                }
            },
            "email": "[email protected]",
            "login": {
                "uuid": "6ee5b2e8-01c3-4314-8f7f-80059f5dd9ec",
                "username": "lazyzebra585",
                "password": "walter",
                "salt": "afXmogsa",
                "md5": "a40e87023b57a4a60c7cb398584cbac3",
                "sha1": "74caf43400be38cce60a8da2e6d1c367246505c2",
                "sha256": "1becdf34bcc6704726c7e9b38821a5792f9dd0689d30789fb5e099a6e51e860a"
            },
            "dob": {
                "date": "1947-06-06T02:45:41.895Z",
                "age": 75
            },
            "registered": {
                "date": "2003-03-25T00:15:32.791Z",
                "age": 19
            },
            "phone": "06-9388-6976",
            "cell": "0469-101-424",
            "id": {
                "name": "TFN",
                "value": "561493929"
            },
            "picture": {
                "large": "https://randomuser.me/api/portraits/men/32.jpg",
                "medium": "https://randomuser.me/api/portraits/med/men/32.jpg",
                "thumbnail": "https://randomuser.me/api/portraits/thumb/men/32.jpg"
            },
            "nat": "AU"
        }"""

Let’s print the decoded JSON at each step to see what happens:

json.loads(json_obj, object_hook = print)
{'title': 'Mr', 'first': 'Ian', 'last': 'Walters'}
{'number': 3161, 'name': 'Saddle Dr'}
{'latitude': '-84.7903', 'longitude': '-29.1020'}
{'offset': '+9:00', 'description': 'Tokyo, Seoul, Osaka, Sapporo, Yakutsk'}
{'street': None, 'city': 'Bendigo', 'state': 'Western Australia', 'country': 'Australia', 'postcode': 4285, 'coordinates': None, 'timezone': None}
{'uuid': '6ee5b2e8-01c3-4314-8f7f-80059f5dd9ec', 'username': 'lazyzebra585', 'password': 'walter', 'salt': 'afXmogsa', 'md5': 'a40e87023b57a4a60c7cb398584cbac3', 'sha1': '74caf43400be38cce60a8da2e6d1c367246505c2', 'sha256': '1becdf34bcc6704726c7e9b38821a5792f9dd0689d30789fb5e099a6e51e860a'}
{'date': '1947-06-06T02:45:41.895Z', 'age': 75}
{'date': '2003-03-25T00:15:32.791Z', 'age': 19}
{'name': 'TFN', 'value': '561493929'}
{'large': 'https://randomuser.me/api/portraits/men/32.jpg', 'medium': 'https://randomuser.me/api/portraits/med/men/32.jpg', 'thumbnail': 'https://randomuser.me/api/portraits/thumb/men/32.jpg'}
{'gender': 'male', 'name': None, 'location': None, 'email': '[email protected]', 'login': None, 'dob': None, 'registered': None, 'phone': '06-9388-6976', 'cell': '0469-101-424', 'id': None, 'picture': None, 'nat': 'AU'}

So json.loads() calls the object_hook function every time it reads a fully formed JSON, that is, every time it closes a bracket pair {}. Then, it creates the whole JSON object by using the result of the object_hook function – note the None (the return value of print) in the last printed line.

We will show two work-arounds for this issue. The first is to modify our User.__init__() method to be more flexible with respect to the input. We will do this using the __dict__ attribute. Every Python object has a __dict__ attribute that holds every attribute’s name and value. Our modified __init__() method will update this dictionary:

class User():
  def __init__(self, input):
      self.__dict__.update(input)

user = json.loads(json_obj, object_hook = User)
print(f"User {user.name.first} {user.name.last}, age {user.dob.age}, email {user.email}")

Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!

User Ian Walters, age 75, email [email protected]

Another possible work-around is to use the collections.namedtuple class:

from collections import namedtuple

def create_user(input):
    User = namedtuple('User', input.keys())
    return User(**input)
    
user = json.loads(json_obj, object_hook=create_user)
print(f"User {user.name.first} {user.name.last}, age {user.dob.age}, email {user.email}")
User Ian Walters, age 75, email [email protected]

where namedtuple('User', input.keys()) creates a tuple subclass called User with the input’s keys as attributes names, and User(**input) assigns the corresponding values for the attributes.

Time Stamp:

More from Stackabuse