Chapter 04
Adding Rolling File Support
Note
The code for the entire chapter can be found at the src/chapter_04.5 directory
Since the beginning of Chapter 4, where we introduced logtar
, our own logging library, we have been discussing the rolling file support. This involves creating new files based on the rolling configuration set in config.json
or provided via the LogConfig.from_json
method. In this chapter, we will finish building this feature.
Rolling features
In our RollingConfig
class, we defined 2 private variables - #time_threshold
and #size_threshold
// file: lib/config/rolling-config.js
class RollingConfig {
#time_threshold = RollingTimeOptions.Hourly;
#size_threshold = RollingSizeOptions.FiveMB;
...
}
#time_threshold
This defines how often do we need to create a new file. For example, if the set value is RollingTimeOptions.Hourly
that means, we'll be creating a new log file every hour.
#size_threshold
This property tells what should be the maximum size of a log file. If it exceeds the size, we'll simply create a new log file.
Let's start coding.
The rolling_check() method
We're going to introduce a new private method Logger.rolling_check
, that is going to take care of the rolling logic.
// file: lib/logger.js
class Logger {
...
/**
* Checks if the current log file needs to be rolled over.
*/
async #rolling_check() {
const { size_threshold, time_threshold } = this.#config.rolling_config;
const { size, birthtimeMs } = await this.#log_file_handle.stat();
const current_time = new Date().getTime();
if (size >= size_threshold || (current_time - birthtimeMs >= time_threshold * 1000)) {
await this.#log_file_handle.close();
await this.init();
}
}
...
}
file_handle.stat()
Let's go through the code line by line:
const { size_threshold, time_threshold } = this.#config.rolling_config;
This line above destructures the size_threshold
and time_threshold
properties from the rolling_config
member object.
const { size, birthtimeMs } = await this.#log_file_handle.stat();
We're destructuring the size
and the birthtimeMs
properties from the file_handle.stat()
method.
Node.js provides us with a stat
method on the FileHandle
class, that can be used to check various stats of a file. The stat method provides a lot of useful information about the current file/file_handle. Some of them are methods, and some are properties.
Let's take a look at some of the useful information provided by file_handle.stat()
:
// Returns true if the `file_handle` is a file. Fails if the file handle is
// pointing to a directory instead
file_handle.stat().isFile();
// Returns true if this `file_handle` refers to a socket
file_handle.stat().isSocket();
// Numeric indentifier of the device which contains this file.
file_handle.stat().dev;
// The size of the file in bytes.
file_handle.stat().size;
// Time since this file was last accessed, in `ms`
file_handle.stat().atimeMs;
// Time since the file was last modified, in `ms`
file_handle.stat().mtimeMs;
// Time since the last time it's file status changed.
// File status is a two-byte code that indicates how a file operation completed;
// either successfully, or with some form of error.
file_handle.stat().ctimeMs;
// Time since this file was created, in `ms`
file_handle.stat().birthtimeMs;
// Same methods as above, but the value is in `ns` (nanoseconds)
// 1 ns = 1,000,000 ms
file_handle.stat().atimeNs;
file_handle.stat().mtimeNs;
file_handle.stat().ctimeNs;
file_handle.stat().birthtimeMs;
// Same methods as above, but the type of this is `Date`. The actual javascript Date class
file_handle.stat().atime;
file_handle.stat().mtime;
file_handle.stat().ctime;
file_handle.stat().birthtime;
I've omitted the ones that are not of any use for us.
const current_time = new Date().getTime();
We are creating a new Date
object and uses the getTime()
method to obtain the current time in milliseconds since epoch.
if (size >= size_threshold || (current_time - birthtimeMs >= time_threshold)) {
We'll be creating a new file:
- If the
size
of the log file is greater than or equal to thesize_threshold
, or - If the time difference between
current_time
andbirthtimeMs
(file creation time) is greater than or equal to thetime_threshold
.
await this.#log_file_handle.close();
Closes the file handle after waiting for any pending operation on the handle to complete. This is important. We should make sure if there are any pending operations, or the Node.js runtime hasn't flushed (written) the entire file, we're going to wait till then.
When everything is flushed, we're simply closing the file handle, so nothing in our application will access this file again.
await this.init();
We call the Logger.init
method to create the new file again. Here's the code for init()
just in case:
async init() {
const log_dir_path = check_and_create_dir("logs");
const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
this.#log_file_handle = await fs.open(path.join(log_dir_path, file_name), "a+");
}
Calling the rolling_check method
Where are we going to call this method that we just created? Of-course, it's the Logger.#log
method.
class Logger {
...
async #log(message, log_level) {
if (log_level < this.#config.level || !this.#log_file_handle.fd) {
return;
}
await this.#rolling_check(); // Call the check method.
// I've extracted all the writing functionality into a separate method
await this.#write_to_handle(message, log_level);
}
...
}
Here's the abstracted functionality in the Logger.#write_to_handle
method.
class Logger {
...
async #write_to_handle(message, log_level) {
const date_iso = new Date().toISOString();
const log_level_string = LogLevel.to_string(log_level);
const log_message = `[${date_iso}] [${log_level_string}]: ${get_caller_info()} ${message}\n`;
await this.#log_file_handle.write(log_message);
}
...
}
Let's try to run the test.js
file again.
// file: test.js
const { LogConfig } = require("./lib/config/log-config");
const { Logger } = require("./lib/logger");
async function init() {
const logger = Logger.with_config(LogConfig.from_file("config.json"));
await logger.init();
return logger;
}
async function main() {
const logger = await init();
logger.info("Hello World!\n");
}
main();
Oops...
TypeError: Cannot read properties of undefined (reading 'split')
at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:24:36)
at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:112:69)
at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:89:36)
A big gotcha!
You might have guessed what part of the code crashed. Yes, that's the get_caller_info
function defined inside lib/utils/helpers.js
. That function is in-charge of collecting the stack trace and returning the line that includes the caller. Here's how that function looks:
// file: lib/utils/helpers.js
function get_caller_info() {
const error = {};
Error.captureStackTrace(error);
// this line fails.
const caller_frame = error.stack.split("\n")[4];
const meta_data = caller_frame.split("at ").pop();
return meta_data;
}
But why? Shouldn't this code work properly? Well, there's a huge part of async
in general that you should be aware of. For now let's add a console.log
to check that the stack trace is.
function get_caller_info() {
...
console.log(err.stack);
const caller_frame = error.stack.split("\n")[4];
...
}
Here's what the stack trace looks like:
Error
at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:21:11)
at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:113:69)
at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:90:36)
Where are the lines that refer to the logger.info
and the main()
function? They seem to be missing...
Stack traces across await points
An "asynchronous point" or an "await point" is a concept that refers to the moment when control is handed back to the JavaScript runtime during an asynchronous operation.
This point is marked by the await
keyword or when a promise's then()
or catch()
method is called. At an await
point, the runtime can execute other tasks while keeping track of the ongoing asynchronous operation. But this has something going behind the scenes.
When an async
function is invoked and it encounters an await
expression, the function execution is essentially paused and control is returned to the event loop. The event loop then continues processing other tasks in the queue. When the awaited asynchronous operation completes (e.g., a Promise resolves), the function is resumed from where it was paused.
The function kind of yields
back to the runtime, saying - "Hey, please continue with the event loop and continue this function from this line whenever you get that data".
When an error occurs within an asynchronous function, the stack trace that is captured includes information up to the point where the asynchronous operation was initiated (i.e., where the await
is placed). However, it does not include the full call stack leading up to the initial invocation of the asynchronous function. This is because the call stack gets unwound as control is returned to the event loop.
If an error occurs within an asynchronous function before the await
expression, the stack trace will include information about where the error happened and all the calling functions leading up to that point.
If an error occurs after the await
expression, the stack trace will only include information about the current function and the awaited function that was resumed after the asynchronous operation completed. It will not include the calling functions that led to the initial invocation of the asynchronous function.
This behavior is designed to make the program more efficient. But what's causing the issue in our code?
The culprit
// file: lib/logger.js
class Logger {
...
async #log(message, log_level) {
if (log_level < this.#config.level || !this.#log_file_handle.fd) {
return;
}
// This is causing the issue
await this.#rolling_check();
await this.#write_to_handle(message, log_level);
}
...
}
The async
method #rolling_check()
is the one that's causing the stack trace to trim all the required info.
When the runtime encounters the expression starting with await
, it pauses the execution of that function and "unwind the stack" in a non-blocking manner. This means that the method will yield control back to the event loop, allowing other tasks to be processed while the awaited asynchronous operation completes.
So, all the information about who invoked the #log
method is dropped. This is done to make the code more efficient. The more information the runtime needs to hold, the more memory it will occupy.
So how do we fix it? It's very simple. Write to the file_handle
before we do the rolling_check
. Here's the updated code:
// file: lib/logger.js
class Logger {
...
async #log(message, log_level) {
if (log_level < this.#config.level || !this.#log_file_handle.fd) {
return;
}
await this.#write_to_handle(message, log_level);
// Make sure we're doing the writing before checking for rolling creation
await this.#rolling_check();
}
...
}
Let's try to run test.js
again. That works!
[2023-08-21T23:28:59.178Z] [INFO]: Logger.info (/Users/ishtmeet/Code/logtard/lib/logger.js:128:18) Hello World!
This is still not accurate. It's telling that the function which was responsible for writing to the file was Logger.info
, however it is not correct.
Why is that? It's because we introduce a new method, this.#write_to_handle()
, so the line which contains our actual invocation went more line down. To illustrate, let's print our stack trace.
function get_caller_info() {
...
console.log(err.stack);
const caller_frame = error.stack.split("\n")[4];
...
}
The output is:
1 Error
2 at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:21:11)
-> at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:113:69)
4 at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:89:36)
5 at Logger.info (/Users/ishtmeet/Code/logtard/lib/logger.js:128:18)
6 at main (/Users/ishtmeet/Code/logtard/test.js:13:12)
The line marked with ->
shows the extra method which was pushed on the stack, because we abstracted all the functionality into that method. To fix that, we're going to update the code to get the 6th
line instead of the 5th
// file: lib/utils/helpers.js
function get_caller_info() {
...
const caller_frame = error.stack.split("\n")[5];
...
}
Let's test whether it works one last time:
[2023-08-21T23:37:11.919Z] [INFO]: main (/Users/ishtmeet/Code/logtard/test.js:13:12) Hello World!
Everything is working as expected.
Testing the new Log file creation
Update our RollingTimeOptions
class to include a new FiveSeconds
as a temporary variable, which we are going to remove after testing. Also make sure to add it inside the static assert()
method, or this method will reject a value it doesn't know.
// file: lib/utils/rolling-options.js
class RollingTimeOptions {
static FiveSeconds = 5; // Every 5 seconds
static Minutely = 60; // Every 60 seconds
...
static assert(time_option) {
// Add `this.FiveSeconds` for validation
if (![this.FiveSeconds, this.Minutely, this.Hourly, this.Daily, this.Weekly, this.Monthly, this.Yearly].includes(time_option)) {
throw new Error(
`time_option must be an instance of RollingConfig. Unsupported param ${JSON.stringify(time_option)}`
);
}
}
}
The config.json
should look like this
// file: config.json
{
"level": 0, // Log Debug messages and above
"log_prefix": "LogTar_",
"rolling_config": {
"size_threshold": 1048576, // 1 MB: 1024 * 1024
"time_threshold": 5 // Create a new file every 5 seconds
}
}
We're going to test the time_threshold
for now. I've set it to 5
seconds, so it creates a new log file every time the file_handle.stat().birthtimeMs > 5000
.
Inside test.js
run a log statement every 1 second:
// file: test.js
...
async function main() {
const logger = await init()
setInterval(() => {
logger.critical("This is critical");
}, 1000);
}
...
Run the program. You'll notice a new log file created every 5 seconds!
We are going to test the size_threshold
rolling option now. Remove the newly created FiveSeconds
static member from RollingTimeOptions
.
Update your code in test.js
:
// file: test.js
async function main() {
const logger = await init();
setInterval(() => {
logger.critical("Hi there");
}, 20); // run every 20 milliseconds
}
The config.json
should specify the size_threshold
as 20428
bytes (20 KB).
{
"level": 0,
"log_prefix": "LogTar_",
"rolling_config": {
// max size should be 20 KB (20 * 1024 bytes)
"size_threshold": 20480,
"time_threshold": 60
}
}
Run the test.js
script. You'll notice a new file created every now and then. Let's check the sizes of those log files.
$ ls -al ./logs
.rw-r--r-- 10k ishtmeet 22 Aug 06:11 Logtar_2023-08-22T00-41-58-132Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-41-59-232Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-01-101Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-02-139Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-03-545Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-04-132Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-05-382Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-06-432Z.log
.rw-r--r-- 10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-07-109Z.log
.rw-r--r-- 3.6k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-08-138Z.log
Each file is only 10KB, which is exactly what we need. However, the last file is not 10KB because I had to exit the infinite loop caused by setInterval
by pressing Ctrl + C
.
We now have an actual logging library that can be used with any type of project. However, there is still something we need to take care of: providing a middleware function that can be used.
For instance, many Node.js backend frameworks, such as express, utilize the middleware pattern. In this pattern, incoming requests are passed through a series of functions (the middlewares), which utilize the request, and either reject it or forward it to the next middleware.
To use a logging library, we will print out certain information, such as the IP of the request, the query, pathname, or anything else.
Note
We will ensure that our library is compatible with other frameworks. However, we will address this after we have built enough of our backend framework.