Max file-name length in an EXT4 file system.

A recent discussion at work brought up the question “What can be the length of a file name in EXT4”. Or in other words, what would be the maximum character length of the name for a file in EXT4?

Wikipedia states that it’s 255 Bytes, but how does that come to be? Is it 255 Bytes or 255 characters?

In the kernel source for the 2.6 kernel series (the question was for a RHEL6/EXT4 combination), in  fs/ext4/ext4.h, we’d be able to see the following:


#define EXT4_NAME_LEN 255

struct ext4_dir_entry {
    __le32 inode;             /* Inode number */
    __le16 rec_len;           /* Directory entry length */
    __le16 name_len;          /* Name length */
    char name[EXT4_NAME_LEN]; /* File name */
};

/*
* The new version of the directory entry. Since EXT4 structures are
* stored in intel byte order, and the name_len field could never be
* bigger than 255 chars, it's safe to reclaim the extra byte for the
* file_type field.
*/

struct ext4_dir_entry_2 {
    __le32 inode;             /* Inode number */
    __le16 rec_len;           /* Directory entry length */
    __u8 name_len;            /* Name length */
    __u8 file_type;
    char name[EXT4_NAME_LEN]; /* File name */
};

This shows that there are two versions of the directory entry structure, ie.. ext4_dir_entry and ext4_dir_entry_2

A directory entry structure carries the file/folder name and the corresponding inode number under every directory.

Both structs use an element named name_len to denote the length of the file/folder name.

If the EXT filesystem feature filetype is not set, the directory entry structure falls to the first method ext4_dir_entry, else it’s the second, ie.. ext4_dir_entry_2.

By default, the file system feature filetype is set, hence the directory entry structure is ext4_dir_entry_2 . As seen above, in this case, the name_len field is set to 8 bits.

__u8 represents an unsigned 8-bit integer in C, and can store values from 0 to 255.

ie.. 2^8 = 255 (0 t0 255 == 256)

ext4_dir_entry has a name_len of __le16, but it seems that the file-name length can only go to a max of 256.

Observations:

  1. The maximum name length is 255 characters on Linux machines.
  2. The actual name length of a file/folder is stored in name_len in each directory entry, under its parent folder. So if the file name length is 5 characters, 5 would be the value set for name_len for that particular file. ie.. the actual length.
  3. A character will consume a byte of storage, so the number of characters in a file name will map to the respective number bytes. If so, a file with a name_len of 5 will be using 5 bytes of memory to store the name.

Hence, name_len denotes the number of characters that a file can have. Since U8 is 8-bits, name_len can store a file name with upto 255 chars.

Now the actual memory being consumed for storing these characters is not denoted by name_len. Since the size of a character translates to a byte, the maximum size wrt memory that a file name can have is 255 Bytes.

NOTE:

The initial dir entry structure ext4_dir_entry had __le16 for name_len, it was later re-sized to __u8 in ext4_dir_entry_2 , by culling 8 bits from the existing 16 bits of name_len.

The remaining free space culled from name_len was assigned to store the file type, in ext4_dir_entry_2. It was named file_type with size __u8.

file_type helps to identity the file types such as regular files, sockets, character devices, block devices etc..

References:

  1. RHEL6 kernel-2.6.32-573.el6 EXT4 header file (ext4.h)
  2. EXT4 Wiki – Disk layout
  3. http://unix.stackexchange.com/questions/32795/what-is-the-maximum-allowed-filename-and-folder-size-with-ecryptfs

Inheritance and super() – Object Oriented Programming

super() is a feature through which inherited methods can be accessed, which has been overridden in a class. It can also help with the MRO lookup order in case of multiple inheritance. This may not be obvious first, but a few examples should help to drive the point home.

Inheritance and method overloading was discussed in a previous post, where we saw how inherited methods can be overloaded or enhanced in the child classes.

In many scenarios, it’s needed to overload an inherited method, but also call the actual method defined in the Parent class.

Let’s start off with a simple example based on Inheritance, and build from there.

Example 0:

class MyClass(object):

    def func(self):
        print("I'm being called from the Parent class!")

class ChildClass(MyClass):
    pass

my_instance_1 = ChildClass()
my_instance_1.func()

This outputs:

In [18]: %run /tmp/super-1.py
I'm being called from the Parent class

In Example 0, we have two classes, MyClass and ChildClass. The latter inherits from the former, and the parent class MyClass has a method named func defined.

Since ChildClass inherits from MyClass, the child class has access to the methods defined in the parent class. An instance is created my_instance_2, for ChildClass.

Calling my_instance_1.func() will print the statement from the Parent class, due to the inheritance.

Building up on the first example:

Example 1:

class MyClass(object):

    def func(self):
        print("I'm being called from the Parent class")

class ChildClass(MyClass):

    def func(self):
        print("I'm being called from the Child class")

my_instance_1 = MyClass()
my_instance_2 = ChildClass()

my_instance_1.func()
my_instance_2.func()

This outputs:

In [19]: %run /tmp/super-1.py
I'm being called from the Parent class
I'm being called from the Child class

This example has a slight difference, both the child class as well as the parent class have the same method defined, ie.. func. In this scenario, the parent class’ method is overridden by the child class method.

ie.. if we call the func() method from the instance of ChildClass, it need not go a fetch the method from its Parent class, since it’s already defined locally.

NOTE: This is due to the Method Resolution Order, discussed in an earlier post.

But what if there is a scenario that warranties the need for specifically calling methods defined in the Parent class, from the instance of a child class?

ie.. How to call the methods defined in the Parent class, through the instance of the Child class, even if the Parent class method is overloaded in the Child class?

In such a case, the inbuilt function super() can be used. Let’s add to the previous example.

Example 2:

class MyClass(object):

    def func(self):
        print("I'm being called from the Parent class")

class ChildClass(MyClass):

    def func(self):
        print("I'm actually being called from the Child class")
        print("But...")
        # Calling the `func()` method from the Parent class.
        super(ChildClass, self).func()

my_instance_2 = ChildClass()
my_instance_2.func()

This outputs:

In [21]: %run /tmp/super-1.py
I'm actually being called from the Child class
But...
I'm being called from the Parent class

How is the code structured?

  1. We have two classes MyClass and ChildClass.
  2. The latter is inheriting from the former.
  3. Both classes have a method named func
  4. The child class ChildClass is instantiated as my_instance_2
  5. The func method is called from the instance.

How does the code work?

  1. When the func method is called, the interpreter searches it using the Method Resolution Order, and find the method defined in the class ChildClass.
  2. Since it finds the method in the child class, it executes it, and prints the string “I’m actually being called from the Child class”, as well “But…”
  3. The next statement is super which calls the method func defined in the parent class of ChildClass
  4. Since the control is now passed onto the func method in the Parent class via super, the corresponding print() statement is printed to stdout.

Example 2 can also be re-written as :

class MyClass(object):

    def func(self):
        print("I'm being called from the Parent class")

class ChildClass(MyClass):

    def func(self):
        print("I'm actually being called from the Child class")
        print("But...")
        # Calling the `func()` method from the Parent class.
        # super(ChildClass, self).func()
        MyClass.func(self)  # Call the method directly via Parent class

my_instance_2 = ChildClass()
my_instance_2.func()

 

NOTE: The example above uses the Parent class directly to access it’s method. Even though it works, it is not the best way to do it since the code is tied to the Parent class name. If the Parent class name changes, the child/sub class code has to be changed as well. 

Let’s see another example for  super() . This is from our previous article on Inheritance and method overloading.

Example 3:

import abc

class MyClass(object):

    __metaclass__ = abc.ABCMeta

    def my_set_val(self, value):
        self.value = value

    def my_get_val(self):
        return self.value

    @abc.abstractmethod
    def print_doc(self):
        return

class MyChildClass(MyClass):

    def my_set_val(self, value):
        if not isinstance(value, int):
            value = 0
        super(MyChildClass, self).my_set_val(self)

    def print_doc(self):
        print("Documentation for MyChild Class")

my_instance = MyChildClass()
my_instance.my_set_val(100)
print(my_instance.my_get_val())
print(my_instance.print_doc())

The code is already discussed here. The my_set_val method is defined in both the child class as well as the parent class.

We overload the my_set_val method defined in the parent class, in the child class. But after enhancing/overloading it, we call the my_set_val method specifically from the Parent class using super() and thus enhance it.

Takeaway:

  1. super() helps to specifically call the Parent class method which has been overridden in the child class, from the child class.
  2. The super() in-built function can be used to call/refer the Parent class without explicitly naming them. This helps in situations where the Parent class name may change. Hence, super() helps in avoiding strong ties with class names and increases maintainability.
  3. super() helps the most when there are multiple inheritance happening, and the MRO ends up being complex. In case you need to call a method from a specific parent class, use super().
  4. There are multiple ways to call a method from a Parent class.
    1. <Parent-Class>.<method>
    2. super(<ChildClass>, self).<method>
    3. super().<method>

References:

  1. https://docs.python.org/2/library/functions.html#super
  2. https://rhettinger.wordpress.com/2011/05/26/super-considered-super/
  3. https://stackoverflow.com/questions/222877/how-to-use-super-in-python