Chapter 04
Writing logs
Note
The code for the entire chapter can be found at the src/chapter_04.2 directory
We've covered how to build the core utility helpers that will help us construct our logging library in a more modular way. However, we haven't gotten to the fun part of actually writing logs yet.
You might be wondering how we can call this a logging library if we haven't even learned how to write logs with it. Don't worry, we're about to get to that!
Before we begin writing logs to files, let's first introduce some concepts related to file-based logging and how we can make it more efficient and faster.
Re-using the File Handle
When logging to a file, it's important to manage the file handle efficiently. The file handle is the connection between our code and the log file on disk. Opening and closing the file for every log entry can be slow and use up resources. To avoid this, we want to reuse the file handle throughout the logging process.
To do this, we open the file handle once when we start the logging library and keep it open until the logging is done. This means we don't need to keep opening and closing the file, which can speed up our logging and save resources.
Log Rotation
Log rotation is a critical strategy in file-based logging to manage log files over time. As your application generates more log data, log files can become large and unwieldy. Log rotation involves creating new log files periodically or based on certain conditions, and optionally archiving or deleting older log files. This helps keep the log files at a manageable size, ensures easier log analysis, and prevents running out of disk space.
Asynchronous Logging
In network based apps, especially those with high levels of concurrency or those that involve asynchronous operations, implementing asynchronous logging can be beneficial, and that's the only reason we're using node:fs/promises
instead of node:fs
. Asynchronous logging ensures that the act of writing logs doesn't block or slow down the main execution thread of your application. This can prevent performance bottlenecks and maintain the responsiveness of your application.
When logging is performed synchronously, each log entry is written immediately to the log file or output, which can introduce delays and impact the overall performance of your application. Asynchronous logging, on the other hand, involves buffering log messages and writing them to the log file in batches or on a separate thread or process.
Note
We'll need to do more optimization than just using asynchronous file writing. Specifically, we should store the entire log contents in memory and write them periodically. This will make the process extremely fast and ensure that it doesn't consume too much memory.
By decoupling the logging process from the main application logic, we can achieve several advantages:
Improved Performance: Asynchronous logging allows the main application thread to continue executing without waiting for log writes to complete. This can be crucial for applications that require high responsiveness and throughput.
Reduced I/O Overhead: Writing log messages to disk can be an I/O-intensive operation. By batching multiple log messages together and writing them in one go, you reduce the frequency of disk I/O operations, which can improve efficiency.
Better Resource Utilization: Asynchronous logging allows you to optimize resource utilization, such as managing concurrent log writes, handling errors without disrupting the main flow, and efficiently managing file handles.
Enhanced Scalability: Applications with multiple threads or processes benefit from asynchronous logging because it minimizes contention for resources like the log file. This is particularly valuable in scenarios where multiple components are concurrently generating log messages
Getting Caller Information (Module and Line Number)
Including caller information, such as the file name and line number from which a log message originated, can significantly enhance the effectiveness of our logging library. This feature provides contextual insight into where specific events occurred in the codebase, making it easier to identify the source of issues and troubleshoot them.
When an application encounters an error or unexpected behavior, having access to the module and line number associated with the log message allows developers to:
- Quickly locate the exact location in the code where the event occurred.
- Understand the sequence of events leading up to the issue.
- Make precise code adjustments to fix problems.
Implementing this feature might involve using techniques from the programming language's stack trace or introspection mechanisms. Here's a high-level overview of how you could achieve this:
Capture Caller Information: When a log message is generated, our logging library will retrieve the caller information, including the module and line number. We'll learn it in a bit that how can we do that.
Format the Log Message: Combine the captured caller information with the log message and other relevant details like timestamp and log level.
Output the Log Message: Write the formatted log message to the desired output destinations, ensuring that the caller information is included.
Enough of the theory, let's start writing logs.
Testing our current API
Let's start with building small features that we need to have on our Logger
class. Before that, we're going to do some testing on whatever we've built till now.
To do the testing, we'll create a new file test.js
and a config.json
. test.js
will hold our code that we write while making use of our logtar
library. The config.json
will used to store our config for LogConfig
as a json
object.
// test.js
const { Logger, LogConfig } = require("./index");
const logger = Logger.with_config(LogConfig.from_file("./config.json"));
The config.json
file has the following contents. You may try to tweak the values as well. Try putting in the values that aren't supported by us, and see whether the assert
methods that we created actually crash the program or not?
// config.json
{
"level": 3,
"file_prefix": "LogTar_",
"rolling_config": {
"size_threshold": 1024000,
"time_threshold": 86400
}
}
Try executing the test.js
file.
$ node test.js
Nothing happens. But this proves that the whole setup for our Logger
and other classes is working perfectly fine.
To check whether the config is being loaded properly, we can create a couple of getter methods on our Logger
class.
// file: lib/logger.js
class Logger {
...
get level() {
return this.#config.level;
}
get file_prefix() {
return this.#config.file_prefix;
}
get time_threshold() {
return this.#config.rolling_config.time_threshold;
}
get size_threshold() {
return this.#config.rolling_config.size_threshold;
}
...
}
Now, in test.js
print those out to the standard output.
$node test.js
## outputs
LogTar_
1024000
86400
3
Perfect! Everything works fine. But what happens if I try to put a value that's not supported by our Logger API? Let's change the time_threshold
to 400
in config.json
and re-run the app.
$node test.js
## outputs
Error: time_option must be an instance of RollingConfig. Unsupported param 400
at Function.assert (/Users/ishtmeet/Code/logtard/lib/utils/rolling-options.js:36:19)
Why? Let's take a quick look at our RollingTimeOptions
utility class
// file: lib/util/rolling-options.js
class RollingTimeOptions {
static Minutely = 60; // Every 60 seconds
... // Other options
// Throw an error when the user sets the value which isn't one of those
static assert(time_option) {
if (![this.Minutely, this.Hourly, this.Daily, this.Weekly, this.Monthly, this.Yearly].includes(time_option)) {
throw new Error(
`time_option must be an instance of RollingConfig. Unsupported param ${JSON.stringify(time_option)}`
);
}
}
}
You might argue that this isn't the best dev experience. However, I think it is. We should always constrain over how configurable our library is. You would hardly need any other duration other than we specified i.e Minutely
, Hourly
, Daily
etc. and anything more than Yearly
Let's change the time_threshold
in config.json
back to 86400
which means 1 day.
Implementing logging methods
Since our logger supports 5 types of log_level
s, let's write a public method for each one of those on our Logger
class.
// file lib/logger.js
class Logger {
...
debug(message) { console.log('Debug: %s', message) }
info(message) { console.log('Info: %s', message) }
warn(message) { console.log('Warn: %s', message) }
error(message) { console.log('Error: %s', message) }
critical(message) { console.log('Critical: %s', message) }
...
}
And in test.js
call these methods
// file: test.js
const { Logger, LogConfig } = require('./index')
const logger = Logger.with_config(LogConfig.from_file('./config.json'));
console.log(logger.file_prefix);
console.log(logger.size_threshold);
console.log(logger.time_threshold);
console.log(logger.level);
logger.debug('Hello debug');
logger.info('Hello info');
logger.warn('Hello warning');
logger.error('Hello error');
logger.critical('Hello critical');
// outputs
LogTar_
1024000
86400
3
Debug: Hello debug
Info: Hello info
Warn: Hello warning
Error: Hello error
Critical: Hello critical
DRY (Don't Repeat Yourself)
The "Don't Repeat Yourself" (DRY) principle is a basic concept in software development that promotes code reusability and maintainability. The idea behind DRY is to avoid duplicating code or logic in multiple places within your codebase. Instead, you aim to create a single source for a particular piece of functionality, and whenever you need that functionality, you refer to that source.
DRY encourages developers to write clean, efficient, and modular code by:
- Reducing the chances of errors: Duplicated code increases the chances of mistakes or bugs when changes are made in one place but not in others.
- Simplifying maintenance: When a change is required, you only need to update the code in one place, making it easier to keep your codebase up-to-date and consistent.
- Enhancing readability: Code that is free from unnecessary duplication is easier to understand and follow, making it more accessible to other developers.
Although following the DRY principle is generally beneficial, there can be situations where duplication might not necessarily be a bad thing. Not every instance of code repetition needs to be eliminated, and striving for absolute DRYness in all cases might lead to overcomplicated solutions or premature abstraction.
A guideline often mentioned is the "Rule of Three": If you find yourself repeating the same code or logic more than three times, it's a strong indication that you should consider refactoring that code into a reusable function, class, or module. This threshold helps you strike a balance between reusability and pragmatic simplicity.
Trying too hard to avoid repeating code can make it so complicated that it's hard to understand. It's better to find a balance where your code can be used again and again, but it's still easy to read and work with.
The log method
We'll introduce a private
member method called log
. The clients don't need to know about what's going inside the library, so making the crucial/core methods private is a nice thing.
Update the code of your Logger
class to include the log
method. Also, update the code so that all other helper methods call the log
method, thus avoiding the code duplication.
// lib/logger.js
const { LogLevel } = require("./utils/log-level");
class Logger {
...
#log(message, log_level) {
console.log('%s: %s', message, log_level)
}
debug(message) {
this.#log(message, LogLevel.Debug);
}
info(message) {
this.#log(message, LogLevel.Info);
}
warn(message) {
this.#log(message, LogLevel.Warn);
}
error(message) {
this.#log(message, LogLevel.Error);
}
critical(message) {
this.#log(message, LogLevel.Critical);
}
...
}
Executing the code gives us the desired output. The output includes the LogLevel
as an integer. However, that's not helpful at all, we should be showing what the LogLevel
is with a string representation of the level.
Let's introduce a new static method to_string
inside the LogLevel
class
// file: lib/utils/log-level.js
class LogLevel {
...
static to_string(log_level) {
const levelMap = {
[this.Debug]: "DEBUG",
[this.Info]: "INFO",
[this.Warn]: "WARN",
[this.Error]: "ERROR",
[this.Critical]: "CRITICAL"
};
if (levelMap.hasOwnProperty(log_level)) {
return levelMap[log_level];
}
throw new Error(`Unsupported log level ${log_level}`);
}
...
}
Change the code inside the log()
method of Logger
class.
// lib/logger.js
const { LogLevel } = require("./utils/log-level");
class Logger {
...
#log(message, log_level) {
console.log('%s: %s', message, LogLevel.to_string(log_level))
}
...
}
This outputs
Hello debug: DEBUG
Hello info: INFO
Hello warning: WARN
Hello error: ERROR
Hello critical: CRITICAL
Everything looks good.
Considering the log_level member variable
Notice that in our config.json
we specified that the log level should be 3
that is LogLevel.Error
. Specifying the log level means that we should only write logs that are equal to or above the specified level.
Imagine a production application, which is usually under a very heavy load. We'd like to specify the level
as LogLevel.Warn
or even LogLevel.Info
. We don't care about the LogLevel.Debug
logs at all. They might pollute the log files as the debug logs are usually very verbose.
We are going to add a small check to ignore the logs which are below the current log_level
.
// file: lib/logger.js
class Logger {
...
#log(message, log_level) {
if (log_level < this.#config.level) {
return;
}
console.log('%s: %s', message, LogLevel.to_string(log_level))
}
...
}
On running, we get the following output.
Hello error: ERROR
Hello critical: CRITICAL
We only write logs based off the current logger's log_level
Writing to a file
We've been using console.log()
to print the log messages to the standard output, or your terminal. However, there are many drawbacks to this approach.
- Lack of Persistence: Console logs are ephemeral and disappear once the application session ends (when you close the terminal or exit out of an ssh connection). This makes it challenging to review logs for past sessions.
- Performance Impact: Continuous console output can impact application performance, especially when generating a high volume of logs. It can slow down the application and interfere with its responsiveness. There are certain ways to get mitigate this, we'll talk about this in a later chapter.
Create a private log_file_handle
member
// file: lib/logger.js
const fs = require('node:fs/promises')
class Logger {
...
/**
* @type {fs.FileHandle}
*/
#log_file_handle;
...
}
The log_file_handle
member variable will be our opened file handle that we'll reference again and again whenever we wish to write logs to a file. It will hold the opened log file, which will be created whenever the client initiates our logging library.
We'll expose a public method on the Logger
called init
and make that async
so that it waits for the init()
to finish initializing the file handle before moving further.
// file: lib/logger.js
const fs = require('node:fs/promises')
class Logger {
...
#log_file_handle;
async init() {
const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
this.#log_file_handle = await fs.open(file_name, "a+");
}
}
There's a lot going on in this method. Let's break it down.
A small primer on Regular Expressions
The
init
method is responsible for initializing the logger, which includes creating or opening a log file with a dynamically generated name based on the current date and time.this.#config.file_prefix
is used to prefix the name of the log file, which is actually set in theconfig.json
file or can be passed as a json object or utilizing theBuilder
pattern.new Date().toISOString()
generates a string representation of the current date and time in the ISO 8601 format, such as "2023-08-18T15:30:00.000Z"..replace(/[\.:]+/g, "-")
is a regular expression operation. Let's break down the regex:the square brackets [] are used to define a character class. A character class is a way to specify a set of characters from which a single character can match. For example:
[abc] matches any single character that is either 'a', 'b', or 'c'.
[0-9] matches any single digit from 0 to 9.
[a-zA-Z] matches any single alphabetical character, regardless of case.
You can also use special characters within character classes to match certain character types, like
\d
for digits,\w
for word characters,\s
for whitespace, etc.In this case we are looking for all the dot (
.
) characters and colon (:
) characters in the string.
\.
matches a literal dot character. Since the dot (.
) is a special character in regular expression, known as a wildcard. It matches any single character except for a newline character (\n
). This is useful when you want to match any character, which is often used in pattern matching.However, in this case, we want to match a literal dot character in the string (the dot in the date-time format "00.000Z"). To achieve this, we need to escape the dot character by preceding it with a backslash (
\
). When you place a backslash before a special character, it indicates that you want to match the literal character itself, rather than its special meaning.+
matches one or more of any character except newline. We match for all the characters following the dot (.) and the (:) character./g
is the global flag, indicating that the replacement should be applied to all occurrences of the pattern.So, the regex
[\.:]+
matches the dot or colon and all the repeated occurences of these 2 characters.The replacement
"-"
removes the dot and all characters after it and replaces it with a hyphen (-
).
The result of the
replace
operation is a modified version of the ISO string, which now includes only the date and time portion, without the fractional seconds..log
is appended to the modified date and time string to form the final log file name.await fs.open(file_name, "a+")
opens or creates the log file using thefs.open
function with "a+" flag. We talked about the modes in a previous chapter- If the file doesn't exist, it will be created.
- If the file exists, data can be appended to it.
- The
"a+"
mode allows both reading and appending. Appending means, we begin writing to the end of the file instead of from the 1st line. However, if the file is empty, it starts from the beginning.
This code initializes the logger by creating or opening a log file with a name based on the current date. The regular expression is used to remove the fractional seconds from the ISO date string, and the resulting string is used as part of the log file name. This allows for creating a new log file every time the init
method is called, typically representing logs for a specific time period.
Note
Since regular expressions are one of the most important concepts which are usually neglected, we'll have an entire chapter dedicated to mastering regular expressions in case you'd like to get comfortable using them.
Testing the log file creation
Let us try to test the init
method, and see if it creates a log file like we desire?
Before that, we'll look at a small nuance of calling an async
function inside a non-async
scope. Add the following log statement in the init
method
// file: lib/logger.js
class Logger {
...
async init() {
const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
this.#log_file_handle = await fs.open(file_name, "a+");
console.log("File created.")
}
...
}
And then call the init
method from the test.js
// file: test.js
const { Logger, LogConfig } = require("./index");
const logger = Logger.with_config(LogConfig.from_file("./config.json"));
logger.init();
console.log("End of the file.");
However the program exits, without creating the log file.
$ node test.js
## Outputs
End of the file.
Why is it so? It is because we're using the node:fs/promises
module, where all the functions are asynchronous. That means, they do not block any code and continue execution without waiting. This is the key to creating performant concurrent applications.
How do we fix it? Simply by await
ing the init()
method.
// file: test.js
const { Logger, LogConfig } = require("./index");
async function main() {
const logger = Logger.with_config(LogConfig.from_file("./config.json"));
await logger.init();
console.log("End of the file");
}
main();
If you try to run now, the output is what you would've expected.
## Outputs
End of the file
You'll also notice that the log file has also been created! For me the log file was created in the root directory. The name of the log file is
LogTar_2023-08-18T17-20-23-132Z.log
LogTar_
is the prefix that I specified in the config.json
file. Following it is the timestamp of when the file was created. We also add a .log
extension at the end. All working as expected.
Another gotcha
If you're paying attention, you'll have already figured out that the way we're providing path to a file is not a nice way. If we wanted to run the test.js
file from a different directory, say lib/config
we'll get an error.
$ cd lib/config && node ../../test.js
// outputs
Error: ENOENT: no such file or directory, open './config.json'
Let's fix this by using the __dirname
global variable that we used earlier.
// file: test.js
const path = require("node:path");
const { Logger, LogConfig } = require("./index");
async function main() {
const logger = Logger.with_config(LogConfig.from_file(path.join(__dirname, "config.json")));
await logger.init();
console.log("End of the file");
}
main();
Now if we try to run it, it works irrespective of the directory you're running the code from.
$ node ../../test.js
## Outputs
End of the file
Logs directory configuration
When we run our code using the node ../../test.js
command, the log files are being created in the same place where we typed that command, which is the lib/config
folder. This is not what we want. Our library is a tool that other people will use. We should have complete say over where the log files go. This is because we want to make sure our library works well no matter how others use it.
Normally, logs (like records of what our code is doing) are put in a special folder called "log." This folder sits at the main spot of the entire project.
By centralizing the log files within the log
directory at the root of the project that includes our library, we achieve a more organized and modularized approach to logging. This ensures that the log files don't clutter the main project's structure and are conveniently located in one designated place.
Our first script
Inside the package.json
file, add a script start
with the value of node test.js
and get rid of the test
script.
// file: package.json
{
...
"scripts": {
"start": "node test.js"
},
...
}
If you run npm start
from any folder of the project, it runs the node test.js
command from the directory the package.json
file is located.
The logs output in the root of the directory. Let's create them inside the log
directory, instead of the root directory.
Update the code inside Logger.init
method:
// file: lib/logger.js
const path = require('node:path')
async init() {
const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/, "-") + ".log";
this.#log_file_handle = await fs.open(path.join('logs', file_name), "a+");
}
If you do run npm start
, it fails.
## Outputs
[Error: ENOENT: no such file or directory, open 'logs/LogTar_2023-08-18T19-14-46-162Z.log']
It's failing because it could not find the log
directory. If we mention something like this in our library's readme:
Hey please create a `log` folder in order for this library to work
They'll simply ignore the library and use something better. Let's make the whole process streamlined. Irrespective of where our library's code live, whether it's inside the node_modules
folder or in the same directory.
The require object
We've been using require
throughout our code, which just imports a module, and exposes all it's exported functionality - whether it's a variable, a function or a class. You may think of it as a function since require('node:path')
indeed looks like a function call.
But, require is not just a function!
require
also acts like an object that has several useful properties and methods associated with it. When you use require
in your code, you are utilizing both its function-like behavior to load modules and its object-like properties and methods to interact with modules and the module system.
For example:
require.resolve('module-name')
: Returns the path of the module without actually loading it.require.cache
: Provides access to the cache of loaded modules.require.main
: Provides access to theModule
object representing the entry script loaded when the Node.js process launched. This is exactly what we need.
The reason require
might feel like both an object and a function is because JavaScript allows functions to have properties. You can test this yourself
function my_fun() {
console.log("hi");
}
my_fun.hey = "there";
console.log(my_fun.hey); // there
my_fun(); // 'hi'
Adding a new helper to create log directory
Create a new file called helpers.js
inside the lib/utils
folder.
// file: lib/utils/helpers.js
const fs_sync = require("node:fs");
const path = require("path");
/**
* @returns {fs_sync.PathLike} The path to the directory.
*/
function check_and_create_dir(path_to_dir) {
const log_dir = path.resolve(require.main.path, path_to_dir);
if (!fs_sync.existsSync(log_dir)) {
fs_sync.mkdirSync(log_dir, { recursive: true });
}
return log_dir;
}
module.exports = {
check_and_create_dir,
};
Let's go through the check_and_create_dir
function's code line by line.
The
path.resolve()
function creates an absolute path by combining a sequence of paths or path segments.It processes the sequence from right to left, with each subsequent path added to the beginning until an absolute path is created. For example, if the path segments are
/foo
,/bar
, andbaz
, callingpath.resolve('/foo', '/bar', 'baz')
would return/bar/baz
because'/bar' + '/' + 'baz'
creates an absolute path, but'baz'
does not.If, after processing all the segments in the given path, an absolute path hasn't been created yet, then the current working directory is used instead.
The
require.main.path
holds the absolute path to the directory of the entry point. The entry point is thetest.js
in our case. Or whatever file you specify while runningnode file_name.js
command.The
fs_sync.existsSync
function is checking if a given directory path exists in the file system. It's using thelog_dir
path (the result of path resolution) to see if the directory is already there.If the directory doesn't exist, the
fs_sync.mkdirSync
function comes into play. It's used to create directories. The{ recursive: true }
option ensures that not only the final directory is created, but also any necessary intermediate directories along the path. In case we wish to specify a log directory aslogs/new/today
At the end, function returns the
log_dir
, which is the full path to the directory. Whether the directory existed before or was just created, this path is what you get as a result from the function.
Updating the init method
// file: lib/logger.js
class Logger {
...
async init() {
const log_dir_path = check_and_create_dir("logs")
const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/, "-") + ".log";
// Open/Create the file at the specified log directory and
// save the file handle to the `#log_file_handle` for further
// use
this.#log_file_handle = await fs.open(path.join(log_dir_path, file_name), "a+");
}
...
}
Completing the log method
The current version of our Logger.log
method looks like this:
##log(message, log_level) {
if (log_level < this.#config.level) {
return;
}
console.log('%s: %s', message, LogLevel.to_string(log_level))
}
Instead of logging to the standard output, let's write it to the log file that is created when the Logger
is initialized.
// file: lib/logger.js
class Logger {
...
async #log(message, log_level) {
// dont' write to the file if
// 1. The `log_level` argument is less than the `#config.level` value
// 2. If the `fd` (file descriptor) is either 0 or -1, which means the file
// descriptor is closed or not opened yet.
if (log_level < this.#config.level || !this.#log_file_handle.fd) {
return;
}
// write logs to the opened file
await this.#log_file_handle.write(log_message);
}
...
}
In the test.js
file
// file: test.js
const path = require("node:path");
const { Logger, LogConfig } = require("./index");
async function initialize_logger() {
let logger = Logger.with_config(LogConfig.from_file(path.join(__dirname, "config.json")));
await logger.init();
return logger;
}
async function main() {
let logger = await initialize_logger();
logger.error("This is an error message");
}
main();
In test.js
we create a new async
function initialize_logger
that creates a logger, init
ializes it and returns it.
We call the logger.error()
method to print the log to the file.
Let's run the code
$ npm start
A new log file will be created inside the logs
directory. Open and see the contents -
This is an error message
Perfect! Everything seems to be working now. The logs are being saved, and we can be proud of it! But wait, the logs aren't useful. We don't know what are those logs, when did we write them and what function called the logger.error()
method?
We'll take care of all this in the next chapter.
Note
The code for the entire chapter can be found at the src/chapter_04.2 directory