Define the shape and behavior of our data.
Data Design in Overseed revolves around schemas.
To generate realistic data in Overseed, we must first design a schema using attributes.
An attribute is a key-value pair, where the key is the attribute name and the value is a constant value, type, specification, or operation.
The attributes define the shape of the data, while the attribute values define the behavior.
// constant value
street: "123 East Street"
// type value
random_int: int
// int constant values
val_one: 1
val_two: 2
// operations on constant values
isEqual: val_one == val_two // false
sum: val_one + val_two // 3
// id starts at a value of 99 and is incremented by 1
id: #SpecNumberStep & {
    value: 99
    step: 1
}
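To make the stepping behavior concrete, here is a minimal Python sketch of the sequence that the id attribute above describes (start at 99, increment by 1). It only illustrates the described behavior and is not Overseed's implementation; the helper name number_step is hypothetical.

```python
from itertools import count, islice


def number_step(value, step):
    """Return an iterator over value, value + step, value + 2 * step, ...

    Hypothetical helper that mimics the behavior described for
    #SpecNumberStep; it is not part of Overseed.
    """
    return count(start=value, step=step)


# First three ids for value=99, step=1
first_ids = list(islice(number_step(99, 1), 3))
print(first_ids)  # [99, 100, 101]
```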
A schema is a collection of attributes.
// address book example schema
id: #SpecNumberStep & { value: 1, step: 1 }
person: {
    first_name: #SpecFakeType & { fakename: "firstname" }
    last_name: #SpecFakeType & { fakename: "firstname" }
    age: >=18 & <=75 & int
}
address: {
    street: "123 East Street"
    state: #SpecProbability & {
        values: ["CA", "MA", "NY"]
        probabilities: [43, 25, 32]
    }
}
As we've just seen, attributes can be nested, just like in JSON. Here is a different way to structure the above schema.
// nested attributes
person: {
    // person.id
    id: #SpecNumberStep & { value: 99, step: 1 }
    // person.first_name
    first_name: #SpecFakeType & { fakename: "firstname" }
    // person.last_name
    last_name: #SpecFakeType & { fakename: "firstname" }
    // person.age
    age: >=18 & <=75 & int
    address: {
        // person.address.street
        street: "123 East Street"
        // person.address.state
        state: #SpecProbability & {
            values: ["CA", "MA", "NY"]
            probabilities: [43, 25, 32]
        }
    }
}
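The dotted paths in the comments above (person.id, person.address.street, and so on) can be derived mechanically from the nesting. The following Python sketch shows one way to do that; the flatten helper and the placeholder strings are illustrative assumptions, not something Overseed exposes.

```python
def flatten(schema, prefix=""):
    """Map each leaf attribute to its dotted path, e.g. person.address.street."""
    paths = {}
    for key, value in schema.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse and extend the dotted path
            paths.update(flatten(value, path))
        else:
            paths[path] = value
    return paths


# Placeholder strings stand in for the attribute values in the schema above.
nested = {
    "person": {
        "id": "<number step>",
        "first_name": "<fake first name>",
        "address": {"street": "123 East Street", "state": "<probability>"},
    }
}

for path in flatten(nested):
    print(path)
```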
Let's do one more, this time using definitions such as #MyDefinition.
// declare address and person definitions
#Address: {
    street: "123 East Street"
    state: #SpecProbability & {
        values: ["CA", "MA", "NY"]
        probabilities: [43, 25, 32]
    }
}
#Person: {
    first_name: #SpecFakeType & { fakename: "firstname" }
    last_name: #SpecFakeType & { fakename: "firstname" }
    #Address // definitions don't require a key
}
// here we define our output by declaring an attribute and definition
id: #SpecNumberStep & { value: 99, step: 1 }
#Person // use #Person definition, returns: first_name, last_name, street, state
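One way to picture why #Person returns first_name, last_name, street, and state is to treat embedding a definition without a key as merging its fields into the parent. The Python sketch below models that with plain dictionaries; the placeholder strings and variable names are assumptions for illustration only.

```python
# Fields of each definition; placeholder strings stand in for generated values.
address = {"street": "123 East Street", "state": "<probability>"}

person = {
    "first_name": "<fake name>",
    "last_name": "<fake name>",
    **address,  # embedding: the definition's fields are merged in without a key
}

# The output declares an id attribute and embeds the person definition.
record = {"id": "<number step>", **person}
print(list(record))  # ['id', 'first_name', 'last_name', 'street', 'state']
```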
As we've mentioned before, attribute values determine the behavior of the data.
How many data behaviors exist today?
There are a few ways we can define behavior in Overseed: types, operations, and specifications.
Need a behavior that's not listed? Contact us at support@overseed.io.
Overseed can convert many of the types and operations available in CUE.
For more on the available types, see the Types section.
// attribute and type
a_string_attribute: string // return a random string
a_number_range_attribute: >=-1 & <=1 & int // return an int chosen randomly from [-1, 0, 1]
an_object_list_attribute: [{num: 1, str: "hello"}, {num: 2, str: "world!"}] // return an object from this value list
// data sample
[
    {
        "a_number_range_attribute": -1,
        "a_string_attribute": "exercitationem",
        "an_object_list_attribute": { "num": 2, "str": "world!" }
    },
    {
        "a_number_range_attribute": -1,
        "a_string_attribute": "reprehenderit",
        "an_object_list_attribute": { "num": 1, "str": "hello" }
    }
]
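As a rough mental model of the sampling described in the comments above, the Python sketch below draws rows for the range and value-list attributes. It assumes uniform random selection, which the text does not state explicitly, and sample_row is a hypothetical helper rather than an Overseed function.

```python
import random


def sample_row(rng):
    """Draw one row; the attribute names mirror the schema above."""
    return {
        # >=-1 & <=1 & int: an int chosen from -1, 0, 1 (bounds inclusive)
        "a_number_range_attribute": rng.randint(-1, 1),
        # value list: pick one of the listed objects
        "an_object_list_attribute": rng.choice(
            [{"num": 1, "str": "hello"}, {"num": 2, "str": "world!"}]
        ),
    }


rng = random.Random(0)  # seeded only so the sketch is repeatable
rows = [sample_row(rng) for _ in range(3)]
for row in rows:
    print(row)
```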
Overseed adds operations so we can work with the data values output from a field.
Operations allow us to do things such as combine strings, add numbers, or reference fields in objects to create relationships (e.g. foreign keys).
Let's see an example using References, String Concatenation, and Multiplication of fields:
// vars object
my_vars: {
    random_number_list: [1, 2, 3] & [...int]
    random_salutation: ["Hello", "Welcome"]
}
// constants object
my_constants: {
    const_num: 3
    const_name: "Ada"
}
// results
concat_result: my_vars.eval.random_salutation + " " + my_constants.const_name // "Hello Ada" or "Welcome Ada"
multiply_result: my_vars.eval.random_number_list * my_constants.const_num // 1 * 3 or 2 * 3 or 3 * 3
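The possible results listed in the comments can be reproduced with a small Python sketch. It assumes that each value list is first resolved to one randomly chosen element and that the operation is then applied to that element; this ordering is an interpretation of the example, not documented Overseed behavior.

```python
import random

rng = random.Random()

my_vars = {
    "random_number_list": [1, 2, 3],
    "random_salutation": ["Hello", "Welcome"],
}
my_constants = {"const_num": 3, "const_name": "Ada"}

# Resolve each value list to a single element (assumed evaluation step)
salutation = rng.choice(my_vars["random_salutation"])
number = rng.choice(my_vars["random_number_list"])

# Apply the operations to the resolved values
concat_result = salutation + " " + my_constants["const_name"]
multiply_result = number * my_constants["const_num"]

print(concat_result, multiply_result)
```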
For more on the available operations, see the following sections:
Overseed adds reusable types with behaviors called specifications that we can assign to our attributes.
Think of a specification like a struct or a class, with parameters that specify certain behaviors. Overseed then converts these specifications into data.
For more on the available specifications, see the Specifications section.
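To illustrate the struct-or-class analogy, here is a minimal Python sketch of a specification modeled as a class with parameters. ProbabilitySpec and its generate method are hypothetical names, and the sketch assumes that probabilities are relative weights, one per value; it is not how Overseed implements #SpecProbability.

```python
import random
from dataclasses import dataclass


@dataclass
class ProbabilitySpec:
    """Hypothetical stand-in for a specification with two parameters."""

    values: list
    probabilities: list  # assumed to be relative weights, one per value

    def generate(self, rng):
        # random.choices draws one value according to the weights
        return rng.choices(self.values, weights=self.probabilities, k=1)[0]


spec = ProbabilitySpec(values=["CA", "MA", "NY"], probabilities=[43, 25, 32])
rng = random.Random(42)  # seeded only so the sketch is repeatable

sample = [spec.generate(rng) for _ in range(1000)]
counts = {state: sample.count(state) for state in spec.values}
print(counts)
```

With enough rows, the share of each state should approach its weight, so CA should appear most often and MA least often.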
Specifications should be named when they are on the same level as other objects. Otherwise, the first defined #Spec will take over the parent object.
The example below shows a card object with two attributes: an unnamed specification and card_purchase_state.
// specification without a name taking over the structure
card: {
    #SpecFakeType & { // since spec is not named, card will output as the only field with credit card numbers.
        fakename: "creditcardnumber"
    }
    card_purchase_state: #SpecProbability & { // this and any following attributes will be ignored.
        values: ["CA", "MA", "NY"]
        probabilities: [43, 25, 32]
    }
}
card becomes of type #SpecFakeType with creditcardnumber, because the spec is not named.
card is output as the only field, containing credit card numbers.
card_purchase_state and any other attributes (of any type) that follow it are ignored, because the card object is processed according to the first defined spec.
Below is a well-formed version with two named specs, one for the card number and one for the purchase state.
// card number and card purchase state
card_number: #SpecFakeType & { // named, will output card_number field
    fakename: "creditcardnumber"
}
card_purchase_state: #SpecProbability & { // named, will output card_purchase_state field
    values: ["CA", "MA", "NY"]
    probabilities: [43, 25, 32]
}
Here are card_number and card_purchase_state again, followed by a data sample.
// faketype and probability specification
card_number: #SpecFakeType & {
    fakename: "creditcardnumber"
}
card_purchase_state: #SpecProbability & {
    values: ["CA", "MA", "NY"]
    probabilities: [43, 25, 32]
}
// data sample
[
    {
        "card_number": 371541075511513,
        "card_purchase_state": "NY"
    },
    {
        "card_number": 4386765607233982,
        "card_purchase_state": "CA"
    },
    {
        "card_number": 816452881196221,
        "card_purchase_state": "MA"
    }
]
OK, how do we create the data?
Next, in the Data Generation section, we look at how to generate and connect to our data.