In our last post, we focused on building our FileSpec and FieldSpec structures in order to define the file type and fields in our file format. We added ValidatorSpec objects to describe the validators we wanted, but we did not translate those into executable code. Now, to round out our FileSpec processing, we need to look at how to translate our validator specifications into behavioral objects. I will be taking the same approach here that I took with FileSpec and FieldSpec readers. We need to build the bridge that translates a structural Validator specification like this:

new ValidatorSpec 
{ 
    Type = ValidatorType.MaxLength,
    Parameters = new List<ValidatorSpecParameter>
    {
        new ValidatorSpecParameter { Name = "MaxCharacterCount", Value = 100 }
    }
}

Into a behavioral object like this:

public class MaxLengthValidator : IValidator
{
    public int MaxCharacterCount { get; private set; }

    public MaxLengthValidator(int maxCharacterCount)
    {
        MaxCharacterCount = maxCharacterCount;
    }

    public bool Check(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return true;
        }

        return value.Length <= MaxCharacterCount;
    }
}

Let's start with the MaxLengthValidator test case and then expand from there. We will need some code that can interpret the specification for a ValidatorSpec with Type ValidatorType.MaxLength and correctly instantiate our MaxLengthValidator object.

For this, we'll create a ValidatorFactory. The initial version looks like this:

public class ValidatorFactory
{
    public IValidator CreateValidator(ValidatorSpec validatorSpec)
    {
        switch (validatorSpec.Type)
        {
            case ValidatorType.MaxLength:
                int maxCharacterCount = GetIntParameter(validatorSpec, "MaxCharacterCount");
                return new MaxLengthValidator(maxCharacterCount);
            default:
                throw new ArgumentException($"Unable to create validator instance for type: {validatorSpec.Type}");
        }
    }

    private int GetIntParameter(ValidatorSpec validatorSpec, string parameterName)
    {
        var intParam = validatorSpec.Parameters.SingleOrDefault(p => p.Name == parameterName);

        if (intParam == null)
        {
            throw new ArgumentException($"Cannot find ValidatorSpecParameter with Name: {parameterName}");
        }

        return Convert.ToInt32(intParam.Value);
    }
}

What we have is a switch statement that inspects the Type property of our ValidatorSpec and then uses that information to construct the correct runtime type. The tests that verify this behavior:

[Test]
public void CanCreateMaxLengthValidator()
{
    var validatorSpec = new ValidatorSpec
    {
        Type = ValidatorType.MaxLength,
        Parameters = new List<ValidatorSpecParameter>
        {
            new ValidatorSpecParameter { Name = "MaxCharacterCount", Value = 100 }
        }
    };

    var validator = new ValidatorFactory().CreateValidator(validatorSpec);

    Assert.True(validator is MaxLengthValidator);
    Assert.AreEqual(100, (validator as MaxLengthValidator).MaxCharacterCount);
}

[Test]
public void WhenUnknownValidatorType_ThrowsArgumentException()
{
    var validatorSpec = new ValidatorSpec
    {
        Type = ValidatorType.None
    };

    Assert.Throws<ArgumentException>(() =>
    {
        new ValidatorFactory().CreateValidator(validatorSpec);
    });
}

How would the original switch statement change if we added the ability to construct validators that are ValidatorType.Required? Our updated CreateValidator method would look like this:

switch (validatorSpec.Type)
{
    case ValidatorType.Required:
        return new RequiredValidator();

    case ValidatorType.MaxLength:
        int maxCharacterCount = GetIntParameter(validatorSpec, "MaxCharacterCount");
        return new MaxLengthValidator(maxCharacterCount);

    default:
        throw new ArgumentException($"Unable to create validator instance for type: {validatorSpec.Type}");
}

And our new RequiredValidator class, which also implements the IValidator interface, looks like this:

public class RequiredValidator : IValidator
{
    public bool Check(string value)
    {
        return !string.IsNullOrEmpty(value);
    }
}

Earlier in this series, in Part 3 ("Format Definition and Validation 1"), we organized our validator logic very differently. Instead of having individual implementations of an IValidator interface, we had one big class called DataRecordValidator. That was a temporary place to hold the validation behavior while I discussed how we might write code to execute the validations in our spec.

Ultimately, we don't want to have one big class that contains all the validation logic for a few reasons:

  1. We want to be able to easily extend the validators in the library, and we want to enable end users to implement their own validators. To do this, we need to create an abstract interface, IValidator, and put our logic in implementations of that interface. That way, we gain the ability to substitute any type of validator logic behind the IValidator interface.
  2. A big class with many different types of validation logic is not very cohesive. Large chunks of unrelated code would be stored together, which would make that code vulnerable to changes made in unrelated logic.
  3. Using validator objects allows us to embed knowledge (data) in the validator instance, which reduces the complexity of the interfaces in our code. For example, the MaxLengthValidator is constructed with the maxCharacterCount parameter, so wherever this validator is called, we don't have to provide that parameter. Compare this with our DataRecordValidator implementation, which required several arguments, including the max character count. Since the IValidator interface does not require these parameters to be passed in, we gain the ability to reference diverse validation logic through a single uniform interface.

Addressing the above concerns results in code that follows the Single-Responsibility Principle and the Open/Closed Principle. We are able to extend our validation capabilities without impacting existing validator types.

We do still have the issue of the ValidatorType enum, which is a centralized type that can't be easily extended. We will leave that for now and revisit it once we start opening up the validation logic for extension in later posts.
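
The third point can be illustrated with a short sketch. The two validator classes below mirror the ones shown in this post; the FieldChecker helper and its IsValid method are hypothetical names introduced only for this example and are not part of the ETElevate code.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IValidator
{
    bool Check(string value);
}

public class RequiredValidator : IValidator
{
    public bool Check(string value) => !string.IsNullOrEmpty(value);
}

public class MaxLengthValidator : IValidator
{
    public int MaxCharacterCount { get; }

    public MaxLengthValidator(int maxCharacterCount)
    {
        MaxCharacterCount = maxCharacterCount;
    }

    // Empty values pass here; RequiredValidator decides whether they are allowed.
    public bool Check(string value) =>
        string.IsNullOrWhiteSpace(value) || value.Length <= MaxCharacterCount;
}

// Hypothetical helper: the caller only sees IValidator and never passes
// validator-specific arguments such as the max character count.
public static class FieldChecker
{
    public static bool IsValid(string value, IEnumerable<IValidator> validators)
    {
        return validators.All(v => v.Check(value));
    }
}
```

For example, FieldChecker.IsValid("Smith", new IValidator[] { new RequiredValidator(), new MaxLengthValidator(100) }) returns true, while the same call with an empty string returns false because RequiredValidator rejects it.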

We can now normalize our ValidCodeContentValidator and ValidDateContentValidator classes from Part 3 and have them implement the same IValidator interface. For example, the changes to ValidCodeContentValidator are very minor:

    public class ValidCodeContentValidator : IValidator
    {
        private readonly IList<string> codeList;

        public ValidCodeContentValidator(IList<string> codeList)
        {
            this.codeList = codeList;
        }
    
        public bool Check(string value)
        {
            if (string.IsNullOrEmpty(value))
            {
                return false;
            }
    
            return codeList.Contains(value);
        }
    }
    

Our ValidatorFactory now supports all of our known validator types. It is a good start, and we will expand it in future posts:

    switch (validatorSpec.Type)
    {
        case ValidatorType.Required:
            return new RequiredValidator();
    
        case ValidatorType.MinLength:
            var minCharacterCount = GetIntParameter(validatorSpec, "MinCharacterCount");
            return new MinLengthValidator(minCharacterCount);
    
        case ValidatorType.MaxLength:
            int maxCharacterCount = GetIntParameter(validatorSpec, "MaxCharacterCount");
            return new MaxLengthValidator(maxCharacterCount);
    
        case ValidatorType.Format:
            string formatRegexPattern = GetStringParameter(validatorSpec, "FormatRegexPattern");
            return new FormatValidator(formatRegexPattern);
    
        case ValidatorType.Code:
            var codeListParam = GetStringParameter(validatorSpec, "CodeList");
            var codeList = codeListParam.Split(",").ToList();
            return new ValidCodeContentValidator(codeList);
    
        case ValidatorType.Date:
            var dateFormat = GetStringParameter(validatorSpec, "DateFormat");
            var culture = new CultureInfo(GetStringParameter(validatorSpec, "CultureName"));
            var minDate = GetDateTimeParameter(validatorSpec, "MinDate");
            var maxDate = GetDateTimeParameter(validatorSpec, "MaxDate");
            return new ValidDateContentValidator(dateFormat, culture, minDate, maxDate);
    
        default:
            throw new ArgumentException($"Unable to create validator instance for type: {validatorSpec.Type}");
    }
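
The switch above calls GetStringParameter and GetDateTimeParameter, which are not shown in this post. A minimal sketch of how they might look, modeled on GetIntParameter, follows. The stand-in spec classes, the ParameterHelpers container, the shared FindParameter lookup, and the choice of invariant-culture date parsing are all assumptions made for the example; the real helpers in the repository may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Minimal stand-ins for the spec types. Value is assumed to be an object
// because the post assigns both numbers and strings to it.
public class ValidatorSpecParameter
{
    public string Name { get; set; }
    public object Value { get; set; }
}

public class ValidatorSpec
{
    public List<ValidatorSpecParameter> Parameters { get; set; } = new List<ValidatorSpecParameter>();
}

// Hypothetical container; in the post these helpers are private methods of ValidatorFactory.
public static class ParameterHelpers
{
    public static string GetStringParameter(ValidatorSpec validatorSpec, string parameterName)
    {
        var parameter = FindParameter(validatorSpec, parameterName);
        return Convert.ToString(parameter.Value, CultureInfo.InvariantCulture);
    }

    // Assumes dates in the spec use a culture-invariant form such as "2020-01-31".
    public static DateTime GetDateTimeParameter(ValidatorSpec validatorSpec, string parameterName)
    {
        var parameter = FindParameter(validatorSpec, parameterName);
        return Convert.ToDateTime(parameter.Value, CultureInfo.InvariantCulture);
    }

    // Shared lookup that throws when the parameter is missing, like GetIntParameter does.
    private static ValidatorSpecParameter FindParameter(ValidatorSpec validatorSpec, string parameterName)
    {
        var parameter = validatorSpec.Parameters.SingleOrDefault(p => p.Name == parameterName);

        if (parameter == null)
        {
            throw new ArgumentException($"Cannot find ValidatorSpecParameter with Name: {parameterName}");
        }

        return parameter;
    }
}
```

Pulling the lookup into one place keeps the "parameter not found" error message identical across all parameter types.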
    

This code definitely raises some questions. A few that occur to me right now:

  1. What if I want to supply my code list from a database or API call?
  2. How would I add a new type of validator? As an end user, how can I expand the list of available validator types to support my specific needs?
  3. As an end user, how do I know what parameters are valid for each validator type?

These are all important questions, and they will all be answered as we continue to build out the solution. For now, I'll leave them as open questions for the reader to consider.
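
To make the second question more concrete, here is one possible direction, offered only as a sketch and not necessarily where this series ends up. Instead of a switch over a closed enum, a registry maps a type name to a creation function, so new validators can be added without editing the factory. ValidatorRegistry, its Register and Create methods, and StartsWithValidator are all hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public interface IValidator
{
    bool Check(string value);
}

public class RequiredValidator : IValidator
{
    public bool Check(string value) => !string.IsNullOrEmpty(value);
}

// Hypothetical end-user validator that the library knows nothing about.
public class StartsWithValidator : IValidator
{
    private readonly string prefix;

    public StartsWithValidator(string prefix)
    {
        this.prefix = prefix;
    }

    public bool Check(string value) =>
        value != null && value.StartsWith(prefix, StringComparison.Ordinal);
}

// Hypothetical registry: maps a validator type name to a function that
// builds the validator from its named parameters.
public class ValidatorRegistry
{
    private readonly Dictionary<string, Func<IDictionary<string, object>, IValidator>> creators =
        new Dictionary<string, Func<IDictionary<string, object>, IValidator>>(StringComparer.OrdinalIgnoreCase);

    public void Register(string typeName, Func<IDictionary<string, object>, IValidator> creator)
    {
        creators[typeName] = creator;
    }

    public IValidator Create(string typeName, IDictionary<string, object> parameters)
    {
        if (!creators.TryGetValue(typeName, out var creator))
        {
            throw new ArgumentException($"Unable to create validator instance for type: {typeName}");
        }

        return creator(parameters);
    }
}
```

An end user could then call registry.Register("StartsWith", p => new StartsWithValidator((string)p["Prefix"])) at startup and reference "StartsWith" from a spec. The trade-off is that type names become strings rather than compile-time enum values.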

We now have enough files in our ETElevate.Core project that it makes sense to create some folders to help with code organization. Our solution now looks like this:

Lastly, I want to talk about how we can externalize this FileSpec into a JSON file and read it from the file into our system. Let's write a unit test that serializes the current FileSpec object to JSON and then deserializes it. We can then compare the result of deserialization with the original input to confirm that we are able to represent the FileSpec format in JSON.

      [Test]
      public void CanSerializeAndDeserializeWithJson()
      {
          var json = JsonConvert.SerializeObject(communityLabsFileSpec);
          var deserializedFileSpec = JsonConvert.DeserializeObject<FileSpec>(json);
      
          Assert.AreEqual(communityLabsFileSpec.FileType, deserializedFileSpec.FileType);
          Assert.AreEqual(communityLabsFileSpec.FirstLineIsColumnHeaders, deserializedFileSpec.FirstLineIsColumnHeaders);
          Assert.AreEqual(communityLabsFileSpec.FieldSpecs.Count, deserializedFileSpec.FieldSpecs.Count);
      
          for (var i = 0; i < communityLabsFileSpec.FieldSpecs.Count; i++)
          {
              Assert.AreEqual(communityLabsFileSpec.FieldSpecs[i].Name, deserializedFileSpec.FieldSpecs[i].Name);
              
              for (var j = 0; j < communityLabsFileSpec.FieldSpecs[i].ValidatorSpecs.Count; j++)
              {
                  Assert.AreEqual(communityLabsFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters.Count, deserializedFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters.Count);
      
                  for (var k = 0; k < communityLabsFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters.Count; k++)
                  {
                      Assert.AreEqual(communityLabsFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters[k].Name,
                          deserializedFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters[k].Name);
      
                      Assert.AreEqual(communityLabsFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters[k].Value,
                          deserializedFileSpec.FieldSpecs[i].ValidatorSpecs[j].Parameters[k].Value);
                  }
              }
          }
      }
      

Now, let's wrap up this discussion by defining our complete file format (as much as we can right now), externalizing it in a config file, and then running a test file through the program.

After building up the entire spec for our sample file format in code (it's too much code to paste here), we can use JsonConvert from Newtonsoft.Json to serialize the format to a JSON blob. My preference is to see enum strings instead of int values in our JSON output, so we just need to add the following serialization attribute to our enum types.

      [JsonConverter(typeof(StringEnumConverter))]
      public enum FileType
      {
          None = 0,
          CommaSeparatedValues = 1
      }
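
To see the effect of the attribute in isolation, the sketch below serializes a trimmed-down spec with a single enum property. MiniFileSpec and EnumSerializationDemo are hypothetical names used only for this illustration; the example assumes the Newtonsoft.Json package is referenced.

```csharp
using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

[JsonConverter(typeof(StringEnumConverter))]
public enum FileType
{
    None = 0,
    CommaSeparatedValues = 1
}

// Hypothetical, trimmed-down stand-in for FileSpec with a single property.
public class MiniFileSpec
{
    public FileType FileType { get; set; }
}

public static class EnumSerializationDemo
{
    // Serializes a spec and returns the JSON text.
    public static string Serialize(FileType fileType)
    {
        return JsonConvert.SerializeObject(new MiniFileSpec { FileType = fileType });
    }

    // Reads the enum back from JSON that uses the string name.
    public static FileType Deserialize(string json)
    {
        return JsonConvert.DeserializeObject<MiniFileSpec>(json).FileType;
    }
}
```

With the attribute in place, EnumSerializationDemo.Serialize(FileType.CommaSeparatedValues) returns {"FileType":"CommaSeparatedValues"}. Without it, the same call would produce {"FileType":1}.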
      

The output of our serialization is a text config file that can be used to specify our test file format (note that I have excluded the comparison validator on Result Date since it isn't implemented yet):

      {
        "FileType": "CommaSeparatedValues",
        "FirstLineIsColumnHeaders": true,
        "FieldSpecs": [
          {
            "Name": "First Name",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "MaxLength",
                "Parameters": [
                  {
                    "Name": "MaxCharacterCount",
                    "Value": 100
                  }
                ]
              }
            ]
          },
          {
            "Name": "Last Name",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "MaxLength",
                "Parameters": [
                  {
                    "Name": "MaxCharacterCount",
                    "Value": 100
                  }
                ]
              }
            ]
          },
          {
            "Name": "Date of Birth",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "Format",
                "Parameters": [
                  {
                    "Name": "FormatRegexPattern",
                    "Value": "\\d{2}/\\d{2}/\\d{4}"
                  }
                ]
              }
            ]
          },
          {
            "Name": "Patient ID",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "MaxLength",
                "Parameters": [
                  {
                    "Name": "MaxCharacterCount",
                    "Value": 15
                  }
                ]
              },
              {
                "Type": "Format",
                "Parameters": [
                  {
                    "Name": "FormatRegexPattern",
                    "Value": "899\\d{12}"
                  }
                ]
              }
            ]
          },
          {
            "Name": "Observation Date",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "Format",
                "Parameters": [
                  {
                    "Name": "FormatRegexPattern",
                    "Value": "\\d{2}/\\d{2}/\\d{4}"
                  }
                ]
              }
            ]
          },
          {
            "Name": "Result Date",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "Format",
                "Parameters": [
                  {
                    "Name": "FormatRegexPattern",
                    "Value": "\\d{2}/\\d{2}/\\d{4}"
                  }
                ]
              }
            ]
          },
          {
            "Name": "Lab Test Type",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "MaxLength",
                "Parameters": [
                  {
                    "Name": "MaxCharacterCount",
                    "Value": 15
                  }
                ]
              },
              {
                "Type": "Code",
                "Parameters": [
                  {
                    "Name": "CodeList",
                    "Value": "83036,82947,83721,83718,82465"
                  }
                ]
              }
            ]
          },
          {
            "Name": "Result Value",
            "ValidatorSpecs": [
              {
                "Type": "Required",
                "Parameters": [
      
                ]
              },
              {
                "Type": "MaxLength",
                "Parameters": [
                  {
                    "Name": "MaxCharacterCount",
                    "Value": 50
                  }
                ]
              }
            ]
          }
        ]
      }
      

Now that we have this file, we can write a test to read in the file, initialize a reader, and then use that reader to create DataRecord objects from the file. This code will go directly into our test for now, but we will soon extract it into its own reusable component.

      [Test]
      public void CanBuildReaderAndReadDataRecordsFromJsonConfigurationFile()
      {
          var config = File.ReadAllText("TestDataFiles\\CommunityLabsFormat.json");
          var fileSpec = JsonConvert.DeserializeObject<FileSpec>(config);
          var reader = new FileReaderFactory().CreateFileReader(fileSpec);
      
          using (var stream = new FileStream("TestDataFiles\\CommunityLabResults.csv", FileMode.Open))
          {
              using (var streamReader = new StreamReader(stream))
              {
                  var recordCount = 0;
      
                  while (!streamReader.EndOfStream)
                  {
                      recordCount++;
                      TestContext.Write($"Record: {recordCount}\t");
      
                      var dataRecord = reader.ReadNextDataRecord(streamReader);
                      TestContext.Write($"First Name: {dataRecord.GetValue("First Name")}\t");
                      TestContext.Write($"Last Name: {dataRecord.GetValue("Last Name")}\t");
                      TestContext.Write($"Date of Birth: {dataRecord.GetValue("Date of Birth")}\t");
                      TestContext.Write($"Patient ID: {dataRecord.GetValue("Patient ID")}\t");
                      TestContext.Write($"Observation Date: {dataRecord.GetValue("Observation Date")}\t");
                      TestContext.Write($"Result Date: {dataRecord.GetValue("Result Date")}\t");
                      TestContext.Write($"Lab Test Type: {dataRecord.GetValue("Lab Test Type")}\t");
                      TestContext.Write($"Result Value: {dataRecord.GetValue("Result Value")}\r\n");
                  }
      
                  Assert.AreEqual(5, recordCount);
              }
          }
      }
      

The test file we are reading is a CSV that matches our file format but does not pass all the validations we've configured:

      "First Name","Last Name","Date of Birth","Patient ID","Observation Date","Result Date","Lab Test Type","Result Value"
      "John","Smith","01/01/1977","123456789","02/01/2020","02/02/2020","A1C","5.7"
      "John","Smith","01/01/1977","123456789","02/01/2020","02/02/2020","Fasting Glucose","100mg"
      "John","Smith","01/01/1977","123456789","02/01/2020","02/02/2020","Cholesterol LDL","127mg"
      "John","Smith","01/01/1977","123456789","02/01/2020","02/02/2020","Cholesterol HDL","62mg"
      "John","Smith","01/01/1977","123456789","02/01/2020","02/02/2020","Cholesterol Total","189mg"
      

There is one header record and five data records. The only thing we're asserting in the test above is that we have read five data records. The output of the TestContext.Write calls is more interesting:

      Record: 1	First Name: "John"	Last Name: "Smith"	Date of Birth: "01/01/1977"	Patient ID: "123456789"	Observation Date: "02/01/2020"	Result Date: "02/02/2020"	Lab Test Type: "A1C"	Result Value: "5.7"
      Record: 2	First Name: "John"	Last Name: "Smith"	Date of Birth: "01/01/1977"	Patient ID: "123456789"	Observation Date: "02/01/2020"	Result Date: "02/02/2020"	Lab Test Type: "Fasting Glucose"	Result Value: "100mg"
      Record: 3	First Name: "John"	Last Name: "Smith"	Date of Birth: "01/01/1977"	Patient ID: "123456789"	Observation Date: "02/01/2020"	Result Date: "02/02/2020"	Lab Test Type: "Cholesterol LDL"	Result Value: "127mg"
      Record: 4	First Name: "John"	Last Name: "Smith"	Date of Birth: "01/01/1977"	Patient ID: "123456789"	Observation Date: "02/01/2020"	Result Date: "02/02/2020"	Lab Test Type: "Cholesterol HDL"	Result Value: "62mg"
      Record: 5	First Name: "John"	Last Name: "Smith"	Date of Birth: "01/01/1977"	Patient ID: "123456789"	Observation Date: "02/01/2020"	Result Date: "02/02/2020"	Lab Test Type: "Cholesterol Total"	Result Value: "189mg"
      

You can see that we've read the data into the correct field position and name. But what about validation? We are not executing any validation rules yet. That will take shape in our next post, where we will discuss how to evaluate the validation rules, produce error messages, and report them to the end user.

There are a few issues with the code at this point that I want to call out. We will address them in the next post, but it's worth noting them now:

  1. Our IValidator interface does not support reporting error messages.
  2. Our IValidator interface cannot support comparison validators (it is single-value only).
  3. We have no way to register new validator types without modifying code.
  4. We need to start building a command line interface so that we can invoke our logic from an end user's perspective.

All of these concerns should be addressed in our next post.

Browse the GitHub Repository at this point in its commit history

ETElevate GitHub Repository Home

Thank you for reading!