Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bogus can generate impossible DateTime values #365

Open
logiclrd opened this issue Mar 16, 2021 · 3 comments · May be fixed by #366
Open

Bogus can generate impossible DateTime values #365

logiclrd opened this issue Mar 16, 2021 · 3 comments · May be fixed by #366

Comments

@logiclrd
Copy link
Contributor

logiclrd commented Mar 16, 2021

Version Information

Software Version(s)
Bogus NuGet Package 31.0.3, 33.0.2
.NET Core?
.NET Full Framework? 4.8
Windows OS? 10
Linux OS?
Visual Studio? 16.9.0 Preview 5

What locale are you using with Bogus?

en-US

What is the expected behavior?

Generated DateTime values always represent realistic dummy data.

What is the actual behavior?

If the range of accepted values contains a Daylight Savings Time shift forward, sometimes Bogus can return impossible DateTime values. For instance, in my locale, dates between 2021-03-14 02:00:00 and 2021-03-14 02:59:59.9999999 don't normally exist, and supplying such values to formatting and serialization functions can cause unexpected behaviour. Bogus has caused automated tests to fail by returning such values for dummy data.

Please provide a stack trace.

n/a

Any possible solutions?

Detect and correct invalid values at every site that asks Bogus for a DateTime value. But, sometimes these sites are buried inside other libraries (e.g. AutoBogus).

How do you reproduce the ?

Ensure that the range requested includes impossible DateTime values and request enough values and eventually you'll get an impossible value.

One way in which these invalid values can cause real-world problems is in serialization. For instance, if an invalid value is put into a DataTable object, then when that DataTable is serialized and then deserialized, the deserialized copy will have a different DateTime value. In my usage so far, this is mostly a problem in automated testing, and automated testing is where Bogus is used as well.

Do you have a unit test that can demonstrate the bug?

			var faker = new Faker();

			const int SampleCount = 1_000_000;

			int badSampleCount = 0;

			var firstInvalidDateTime = new DateTime(2021, 3, 14, 2, 0, 0);
			var firstValidDateTime = new DateTime(2021, 3, 14, 3, 0, 0);

			for (int i = 0; i < SampleCount; i++)
			{
				var value = faker.Date.Between(new DateTime(2021, 3, 14), new DateTime(2021, 3, 15));

				if ((value >= firstInvalidDateTime) && (value < firstValidDateTime))
					badSampleCount++;
			}

			Console.WriteLine("Out of {0} samples, {1} were invalid ({2} %) for CDT", SampleCount, badSampleCount, badSampleCount * 100 / SampleCount);

Can you identify the location in Bogus' source code where the problem exists?

Any of the functions that return bogus DateTime values are susceptible if their range of values can include the spring DST switch in a time zone that does daylight savings. I believe these functions are all contained within the Date.cs Bogus data set type.

A complicating factor is that these values are invalid if the returned value is assumed to be in the local timezone and the local timezone includes a daylight savings switch. But, what if the caller wants DateTime values for a different timezone that does not have a daylight savings switch?

Perhaps it would be a reasonable assumption that the basic overloads should all return valid local DateTime values, and possibly consider adding UTC overloads, and possibly overloads where the caller can specify a TimeZoneInfo.

If the bug is confirmed, would you be willing to submit a PR?

If the desired path forward is clearly identified, I could probably make a PR.

@logiclrd
Copy link
Contributor Author

I wonder if this would work: Always convert the start & end DateTime values to UTC, generate the bogus value in that range as a UTC DateTime, and then convert that back to a local DateTime before returning it.

@logiclrd
Copy link
Contributor Author

I have made a PR that doesn't add any new overloads or functionality, and simply converts the date range to UTC before calculating a random value, then converts the random value back afterward, taking advantage of the framework's built-in DST transition calculations. I created a unit test that fails with the old code and succeeds with the new code and I'm pretty sure is in fact testing the intended functionality. That's in #366.

@bchavez
Copy link
Owner

bchavez commented Mar 17, 2021

Hi @logiclrd, thank you for the PR. I'll try to review the changes this weekend.

@bchavez bchavez linked a pull request Mar 21, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants