YAML Processing

Content:

  1. Introduction
  2. Object tree
  3. Class binding

Introduction:

YAML (YAML Ain't Markup Language) is a format that was invented in 2001 and has in recent years become very popular for configuration purposes, especially in the cloud and container world.

Basic YAML is very easy to write and read, so the popularity is understandable. But note that advanced YAML can be a bit tricky.

The YAML used in this article will be extremely basic.

The examples will focus on reading of configuration files since that is a very relevant use case.

Most examples will be using demo1.yaml:

ab:
  a: 1
  b: 2
c: 3
de:
  d: ABC
  e: DEF
f: GHI

Object tree:

One type of YAML library works on generic types and accesses data via named fields.

This sort of representation does not require any classes to be defined in advance, but tends to be a bit cumbersome to write.
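
To make "generic types and named fields" concrete before the library examples, here is a minimal sketch in plain Python. It assumes a parser has already turned demo1.yaml into nested dictionaries. The `lookup` helper and its dot-separated path syntax are hypothetical illustrations, not part of any YAML library.

```python
from collections.abc import Mapping

# The generic tree a parser might return for demo1.yaml (assumed here;
# the scalar types actually produced vary between libraries).
cfg = {
    "ab": {"a": 1, "b": 2},
    "c": 3,
    "de": {"d": "ABC", "e": "DEF"},
    "f": "GHI",
}


def lookup(tree, path):
    """Follow a dot-separated path such as 'ab.a' through nested mappings."""
    node = tree
    for key in path.split("."):
        # Stop with a clear error if the path leaves the tree.
        if not isinstance(node, Mapping) or key not in node:
            raise KeyError(f"no such field: {path!r} (stopped at {key!r})")
        node = node[key]
    return node


print(lookup(cfg, "ab.a"), lookup(cfg, "de.e"), lookup(cfg, "f"))
```

Nothing in the tree is checked until a field is actually read, which is why this style needs no class definitions but pushes casts and error handling into the calling code.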

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use SnakeYAML.

package yamlproc;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Map;

import org.yaml.snakeyaml.Yaml;

public class SnakeTree {
    public static void main(String[] args) throws IOException {
        Yaml parser = new Yaml();
        @SuppressWarnings("unchecked")
        Map<String, Object> cfg = (Map<String, Object>)parser.load(new FileInputStream("/work/demo1.yaml"));
        @SuppressWarnings("unchecked")
        Map<String, Object> ab = (Map<String, Object>)cfg.get("ab");
        @SuppressWarnings("unchecked")
        Map<String, Object> de = (Map<String, Object>)cfg.get("de");
        int a = (Integer)ab.get("a");
        int b = (Integer)ab.get("b");
        int c = (Integer)cfg.get("c");
        String d = (String)de.get("d");
        String e = (String)de.get("e");
        String f = (String)cfg.get("f");
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use YamlBeans.

package yamlproc;

import java.io.FileReader;
import java.io.IOException;
import java.util.Map;

import com.esotericsoftware.yamlbeans.YamlReader;

public class BeansTree {
    public static void main(String[] args) throws IOException {
        YamlReader parser = new YamlReader(new FileReader("/work/demo1.yaml"));
        @SuppressWarnings("unchecked")
        Map<String, Object> cfg = (Map<String, Object>)parser.read();
        @SuppressWarnings("unchecked")
        Map<String, Object> ab = (Map<String, Object>)cfg.get("ab");
        @SuppressWarnings("unchecked")
        Map<String, Object> de = (Map<String, Object>)cfg.get("de");
        int a = Integer.parseInt((String)ab.get("a"));
        int b = Integer.parseInt((String)ab.get("b"));
        int c = Integer.parseInt((String)cfg.get("c"));
        String d = (String)de.get("d");
        String e = (String)de.get("e");
        String f = (String)cfg.get("f");
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

Several YAML libraries exist for .NET including YamlDotNet, which this example will use.

using System;
using System.Collections.Generic;
using System.IO;

using YamlDotNet.Serialization;

namespace YamlProc.Tree
{
    public class Program
    {
        public static void Main(string[] args)
        {
            IDeserializer dser = (new DeserializerBuilder()).Build();
            IDictionary<object, object> cfg = (IDictionary<object, object>)dser.Deserialize(new StreamReader(@"C:\Work\demo1.yaml"));
            IDictionary<object, object> ab = (IDictionary<object, object>)cfg["ab"];
            IDictionary<object, object> de = (IDictionary<object, object>)cfg["de"];
            int a = int.Parse((string)ab["a"]);
            int b = int.Parse((string)ab["b"]);
            int c = int.Parse((string)cfg["c"]);
            string d = (string)de["d"];
            string e = (string)de["e"];
            string f = (string)cfg["f"];
            Console.WriteLine("{0} {1} {2} {3} {4} {5}", a, b, c, d, e, f);
        }
    }
}

Several YAML libraries exist for Python including PyYAML, which this example will use.

import yaml

with open('/work/demo1.yaml', 'r') as fh:
    cfg = yaml.safe_load(fh)
a = cfg['ab']['a']
b = cfg['ab']['b']
c = cfg['c']
d = cfg['de']['d']
e = cfg['de']['e']
f = cfg['f']
print('%d %d %d %s %s %s' % (a,b,c,d,e,f))

A yaml extension exists for PHP.

<?php
$cfg = yaml_parse_file('C:\work\demo1.yaml');
$a = $cfg['ab']['a'];
$b = $cfg['ab']['b'];
$c = $cfg['c'];
$d = $cfg['de']['d'];
$e = $cfg['de']['e'];
$f = $cfg['f'];
echo sprintf('%d %d %d %s %s %s', $a, $b, $c, $d, $e, $f);
?>
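
The examples above differ in one detail: the SnakeYAML code casts the numeric values directly to Integer, while the YamlBeans and YamlDotNet code parses them from strings. Code that should work with either kind of tree can normalize scalars on access. The sketch below is one hedged way to do that. The `as_int` helper and the two sample trees are assumptions made for illustration, not library API.

```python
def as_int(value):
    """Return value as an int whether the parser produced an int or a numeric string."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; treat it as a configuration error.
        raise TypeError("boolean is not accepted as an integer setting")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        return int(value.strip())  # raises ValueError for non-numeric text
    raise TypeError(f"cannot read {type(value).__name__} as an integer")


# Two assumed shapes of the same data: typed scalars and string scalars.
typed_tree = {"ab": {"a": 1, "b": 2}, "c": 3}
string_tree = {"ab": {"a": "1", "b": "2"}, "c": "3"}

for tree in (typed_tree, string_tree):
    print(as_int(tree["ab"]["a"]), as_int(tree["ab"]["b"]), as_int(tree["c"]))
```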

Class binding:

Another type of YAML library works by binding (mapping) class definitions to YAML structures.

This sort of representation requires classes to be defined in advance, but is very convenient to use.
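
As a language-neutral picture of what a binding library does internally, the following sketch maps an already parsed tree onto classes by hand. It uses only the Python standard library. The `bind_config` function is a hypothetical helper, and the tree literal is assumed to be what a parser returned for demo1.yaml.

```python
from dataclasses import dataclass


@dataclass
class AB:
    a: int
    b: int


@dataclass
class DE:
    d: str
    e: str


@dataclass
class Config:
    ab: AB
    c: int
    de: DE
    f: str


def bind_config(tree):
    """Map the generic tree onto Config; a missing or unexpected key raises an exception."""
    fields = dict(tree)                # copy so the caller's tree is left untouched
    fields["ab"] = AB(**fields["ab"])  # nested mapping becomes a nested object
    fields["de"] = DE(**fields["de"])
    return Config(**fields)


tree = {"ab": {"a": 1, "b": 2}, "c": 3, "de": {"d": "ABC", "e": "DEF"}, "f": "GHI"}
cfg = bind_config(tree)
print(cfg.ab.a, cfg.ab.b, cfg.c, cfg.de.d, cfg.de.e, cfg.f)
```

The convenience comes from doing this mapping once, in one place: the rest of the program reads `cfg.ab.a` instead of casting values out of maps.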

Two flavors of binding exist: mapping done in code, and mapping done in YAML via tags.

Mapping done in code:

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use SnakeYAML.

package yamlproc;

public class AB {
    private int a;
    private int b;
    public int getA() {
        return a;
    }
    public void setA(int a) {
        this.a = a;
    }
    public int getB() {
        return b;
    }
    public void setB(int b) {
        this.b = b;
    }
}

package yamlproc;

public class DE {
    private String d;
    private String e;
    public String getD() {
        return d;
    }
    public void setD(String d) {
        this.d = d;
    }
    public String getE() {
        return e;
    }
    public void setE(String e) {
        this.e = e;
    }
}

package yamlproc;

public class Config {
    private AB ab;
    private int c;
    private DE de;
    private String f;
    public AB getAb() {
        return ab;
    }
    public void setAb(AB ab) {
        this.ab = ab;
    }
    public int getC() {
        return c;
    }
    public void setC(int c) {
        this.c = c;
    }
    public DE getDe() {
        return de;
    }
    public void setDe(DE de) {
        this.de = de;
    }
    public String getF() {
        return f;
    }
    public void setF(String f) {
        this.f = f;
    }
}

package yamlproc;

import java.io.FileInputStream;
import java.io.IOException;

import org.yaml.snakeyaml.Yaml;

public class SnakeBindCode {
    public static void main(String[] args) throws IOException {
        Yaml parser = new Yaml();
        Config cfg = parser.loadAs(new FileInputStream("/work/demo1.yaml"), Config.class);
        int a = cfg.getAb().getA();
        int b = cfg.getAb().getB();
        int c = cfg.getC();
        String d = cfg.getDe().getD();
        String e = cfg.getDe().getE();
        String f = cfg.getF();
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

The class is specified as an argument to the parser.

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use YamlBeans.

package yamlproc;

public class AB {
    private int a;
    private int b;
    public int getA() {
        return a;
    }
    public void setA(int a) {
        this.a = a;
    }
    public int getB() {
        return b;
    }
    public void setB(int b) {
        this.b = b;
    }
}

package yamlproc;

public class DE {
    private String d;
    private String e;
    public String getD() {
        return d;
    }
    public void setD(String d) {
        this.d = d;
    }
    public String getE() {
        return e;
    }
    public void setE(String e) {
        this.e = e;
    }
}

package yamlproc;

public class Config {
    private AB ab;
    private int c;
    private DE de;
    private String f;
    public AB getAb() {
        return ab;
    }
    public void setAb(AB ab) {
        this.ab = ab;
    }
    public int getC() {
        return c;
    }
    public void setC(int c) {
        this.c = c;
    }
    public DE getDe() {
        return de;
    }
    public void setDe(DE de) {
        this.de = de;
    }
    public String getF() {
        return f;
    }
    public void setF(String f) {
        this.f = f;
    }
}

package yamlproc;

import java.io.FileReader;
import java.io.IOException;

import com.esotericsoftware.yamlbeans.YamlReader;

public class BeansBindCode {
    public static void main(String[] args) throws IOException {
        YamlReader parser = new YamlReader(new FileReader("/work/demo1.yaml"));
        Config cfg = parser.read(Config.class);
        int a = cfg.getAb().getA();
        int b = cfg.getAb().getB();
        int c = cfg.getC();
        String d = cfg.getDe().getD();
        String e = cfg.getDe().getE();
        String f = cfg.getF();
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

The class is specified as an argument to the parser.

Several YAML libraries exist for .NET including YamlDotNet, which this example will use.

using System;
using System.IO;

using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

namespace YamlProc.BindCode
{
    public class AB
    {
        public int A { get; set; }
        public int B { get; set; }
    }
    public class DE
    {
        public string D { get; set; }
        public string E { get; set; }
    }
    public class Config
    {
        public AB AB { get; set; }
        public int C { get; set; }
        public DE DE { get; set; }
        public string F { get; set; }
    }
    public class Program
    {
        public static void Main(string[] args)
        {
            IDeserializer dser = (new DeserializerBuilder()).WithNamingConvention(LowerCaseNamingConvention.Instance).Build();
            Config cfg = dser.Deserialize<Config>(new StreamReader(@"C:\Work\demo1.yaml"));
            int a = cfg.AB.A;
            int b = cfg.AB.B;
            int c = cfg.C;
            string d = cfg.DE.D;
            string e = cfg.DE.E;
            string f = cfg.F;
            Console.WriteLine("{0} {1} {2} {3} {4} {5}", a, b, c, d, e, f);
        }
    }
}

The class is specified as a generic parameter to the parser.

Mapping done in YAML via tag:

Note that the semantics of tags for mapping are not fully portable between different libraries.

Also note that allowing the input to control what class is used for deserialization without any limits is a security risk!
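
The usual mitigation is an allow-list: a tag may only select a class that the program has explicitly registered. The sketch below models just that lookup step in plain Python. `ALLOWED_TAGS`, `construct`, and the stand-in `Config` class are hypothetical names for illustration. In a real parser the tag and the mapping would be handed to a hook like this during loading.

```python
class Config:
    """Stand-in for a configuration class that input is allowed to request."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


# Only tags listed here may be turned into objects (hypothetical registry).
ALLOWED_TAGS = {"yamlproc.Config": Config}


def construct(tag, fields):
    """Build the object for a tagged mapping, refusing tags outside the allow-list."""
    cls = ALLOWED_TAGS.get(tag)
    if cls is None:
        raise ValueError(f"tag {tag!r} is not allowed")
    return cls(**fields)


cfg = construct("yamlproc.Config", {"c": 3, "f": "GHI"})
print(cfg.c, cfg.f)

try:
    # A tag naming an arbitrary class is rejected before anything is instantiated.
    construct("subprocess.Popen", {"args": "whoami"})
except ValueError as err:
    print("rejected:", err)
```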

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use SnakeYAML and reads demo2snake.yaml:

!!yamlproc.Config
ab:
  a: 1
  b: 2
c: 3
de:
  d: ABC
  e: DEF
f: GHI

package yamlproc;

public class AB {
    private int a;
    private int b;
    public int getA() {
        return a;
    }
    public void setA(int a) {
        this.a = a;
    }
    public int getB() {
        return b;
    }
    public void setB(int b) {
        this.b = b;
    }
}

package yamlproc;

public class DE {
    private String d;
    private String e;
    public String getD() {
        return d;
    }
    public void setD(String d) {
        this.d = d;
    }
    public String getE() {
        return e;
    }
    public void setE(String e) {
        this.e = e;
    }
}

package yamlproc;

public class Config {
    private AB ab;
    private int c;
    private DE de;
    private String f;
    public AB getAb() {
        return ab;
    }
    public void setAb(AB ab) {
        this.ab = ab;
    }
    public int getC() {
        return c;
    }
    public void setC(int c) {
        this.c = c;
    }
    public DE getDe() {
        return de;
    }
    public void setDe(DE de) {
        this.de = de;
    }
    public String getF() {
        return f;
    }
    public void setF(String f) {
        this.f = f;
    }
}

package yamlproc;

import java.io.FileInputStream;
import java.io.IOException;

import org.yaml.snakeyaml.Yaml;

public class SnakeBindTag {
    public static void main(String[] args) throws IOException {
        Yaml parser = new Yaml();
        Config cfg = (Config)parser.load(new FileInputStream("/work/demo2snake.yaml"));
        int a = cfg.getAb().getA();
        int b = cfg.getAb().getB();
        int c = cfg.getC();
        String d = cfg.getDe().getD();
        String e = cfg.getDe().getE();
        String f = cfg.getF();
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

The class name is picked from the !! tag in the YAML.

Regarding security:

Yaml parser = new Yaml();

allows !! to specify any class, while:

Yaml parser = new Yaml(new SafeConstructor());

does not allow !! to specify a class, while:

Yaml parser = new Yaml(new Constructor(Config.class));

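makes Config the class of the root object. Note that in SnakeYAML versions before 2.0 this does not stop !! tags on nested values from naming other classes (CVE-2022-1471), so SafeConstructor is the safer choice for untrusted input.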

Several YAML libraries exist for Java including SnakeYAML and YamlBeans.

This example will use YamlBeans and reads demo2beans.yaml:

!yamlproc.Config
ab:
  a: 1
  b: 2
c: 3
de:
  d: ABC
  e: DEF
f: GHI

package yamlproc;

public class AB {
    private int a;
    private int b;
    public int getA() {
        return a;
    }
    public void setA(int a) {
        this.a = a;
    }
    public int getB() {
        return b;
    }
    public void setB(int b) {
        this.b = b;
    }
}

package yamlproc;

public class DE {
    private String d;
    private String e;
    public String getD() {
        return d;
    }
    public void setD(String d) {
        this.d = d;
    }
    public String getE() {
        return e;
    }
    public void setE(String e) {
        this.e = e;
    }
}

package yamlproc;

public class Config {
    private AB ab;
    private int c;
    private DE de;
    private String f;
    public AB getAb() {
        return ab;
    }
    public void setAb(AB ab) {
        this.ab = ab;
    }
    public int getC() {
        return c;
    }
    public void setC(int c) {
        this.c = c;
    }
    public DE getDe() {
        return de;
    }
    public void setDe(DE de) {
        this.de = de;
    }
    public String getF() {
        return f;
    }
    public void setF(String f) {
        this.f = f;
    }
}

package yamlproc;

import java.io.FileReader;
import java.io.IOException;

import com.esotericsoftware.yamlbeans.YamlReader;

public class BeansBindTag {
    public static void main(String[] args) throws IOException {
        YamlReader parser = new YamlReader(new FileReader("/work/demo2beans.yaml"));
        Config cfg = (Config)parser.read();
        int a = cfg.getAb().getA();
        int b = cfg.getAb().getB();
        int c = cfg.getC();
        String d = cfg.getDe().getD();
        String e = cfg.getDe().getE();
        String f = cfg.getF();
        System.out.printf("%d %d %d %s %s %s\n", a, b, c, d, e, f);
    }
}

The class name is picked from the ! tag in the YAML.

Several YAML libraries exist for .NET including YamlDotNet, which this example will use.

!!YamlProc.Bind.Config
ab:
  a: 1
  b: 2
c: 3
de:
  d: ABC
  e: DEF
f: GHI

using System;
using System.IO;

using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

namespace YamlProc.BindTag
{
    public class AB
    {
        public int A { get; set; }
        public int B { get; set; }
    }
    public class DE
    {
        public string D { get; set; }
        public string E { get; set; }
    }
    public class Config
    {
        public AB AB { get; set; }
        public int C { get; set; }
        public DE DE { get; set; }
        public string F { get; set; }
    }
    public class Program
    {
        public static void Main(string[] args)
        {
            IDeserializer dser = (new DeserializerBuilder()).WithNamingConvention(LowerCaseNamingConvention.Instance).WithTagMapping("tag:yaml.org,2002:YamlProc.Bind.Config", typeof(Config)).Build();
            Config cfg = (Config)dser.Deserialize(new StreamReader(@"C:\Work\demo2net.yaml"));
            int a = cfg.AB.A;
            int b = cfg.AB.B;
            int c = cfg.C;
            string d = cfg.DE.D;
            string e = cfg.DE.E;
            string f = cfg.F;
            Console.WriteLine("{0} {1} {2} {3} {4} {5}", a, b, c, d, e, f);
        }
    }
}

The class name is picked from the !! tag in the YAML *and* the tag-to-type mapping in the code.

Several YAML libraries exist for Python including PyYAML, which this example will use.

It should be possible to do binding with just PyYAML, but it is easier to use yamlable on top of PyYAML.

!yamlable/yamlproc.Config
ab: !yamlable/yamlproc.AB
  a: 1
  b: 2
c: 3
de: !yamlable/yamlproc.DE
  d: ABC
  e: DEF
f: GHI

import yaml
from yamlable import yaml_info, YamlAble

@yaml_info(yaml_tag_ns='yamlproc')
class AB(YamlAble):
    def __init__(self,a=0,b=0):
        self.a = a
        self.b = b

@yaml_info(yaml_tag_ns='yamlproc')
class DE(YamlAble):
    def __init__(self,d='',e=''):
        self.d = d
        self.e = e

@yaml_info(yaml_tag_ns='yamlproc')
class Config(YamlAble):
    def __init__(self,ab=None,c=0,de=None,f=''):
        self.ab = ab
        self.c = c
        self.de = de
        self.f = f

cfg = yaml.load(open('/work/demo2python.yaml', 'r'), Loader=yaml.FullLoader)
a = cfg.ab.a
b = cfg.ab.b
c = cfg.c
d = cfg.de.d
e = cfg.de.e
f = cfg.f
print('%d %d %d %s %s %s' % (a,b,c,d,e,f))

The class name is picked from the ! tag in the YAML *and* the decorators on the classes in the code.
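
The interplay between tags in the input and registration in the code can be sketched without any YAML library. The code below is only a model of the idea and not how yamlable is implemented. The `yaml_tag` decorator, the `REGISTRY` dictionary, the simplified tag strings, and the (tag, mapping) tuples standing in for tagged YAML nodes are all assumptions made for illustration.

```python
# Maps tag strings to classes; filled in by the decorator below (hypothetical).
REGISTRY = {}


def yaml_tag(namespace):
    """Class decorator: register the class under '!<namespace>.<ClassName>'."""
    def register(cls):
        REGISTRY[f"!{namespace}.{cls.__name__}"] = cls
        return cls
    return register


@yaml_tag("yamlproc")
class AB:
    def __init__(self, a=0, b=0):
        self.a = a
        self.b = b


@yaml_tag("yamlproc")
class DE:
    def __init__(self, d="", e=""):
        self.d = d
        self.e = e


@yaml_tag("yamlproc")
class Config:
    def __init__(self, ab=None, c=0, de=None, f=""):
        self.ab = ab
        self.c = c
        self.de = de
        self.f = f


def build(node):
    """Turn (tag, mapping) pairs into registered objects, recursing into values."""
    if isinstance(node, tuple):
        tag, mapping = node
        cls = REGISTRY[tag]  # KeyError for a tag that no class registered
        return cls(**{key: build(value) for key, value in mapping.items()})
    return node  # plain scalar: returned unchanged


# Tagged tree standing in for a parsed document with one tag per mapping.
doc = ("!yamlproc.Config", {
    "ab": ("!yamlproc.AB", {"a": 1, "b": 2}),
    "c": 3,
    "de": ("!yamlproc.DE", {"d": "ABC", "e": "DEF"}),
    "f": "GHI",
})

cfg = build(doc)
print(cfg.ab.a, cfg.ab.b, cfg.c, cfg.de.d, cfg.de.e, cfg.f)
```

Because only decorated classes end up in the registry, the input can choose among them but cannot name an arbitrary class.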

Article history:

Version  Date           Description
1.0      June 3rd 2021  Initial version

Comments:

Please send comments to Arne Vajhøj