I got this idea while working on a project that integrates with a third-party API. My requirement was to update the database whenever anything changed on the third-party side. Concretely, the project needed to sync a country list from a third-party API. That sounds trivial: countries rarely change, and the list is mostly stable. But here is the catch. Sometimes the provider removes countries from the list because it stops supporting them. (Assume the API does not provide a webhook.)
So, what could I do? Keep serving the stale country list and throw errors from my system every time a removed country came up? No, I needed a way to keep the country list up to date.
While thinking about how to refresh the list, I found an efficient way to solve this problem, and in this post I will explain the steps I followed.
Before that, you might ask: why not serve the country list straight from the third-party API without saving it to the database? That approach has some real drawbacks, so let me explain why I did not choose it.
Performance: every API call adds network latency. The application will feel slow to users if it has to wait for an external service just to populate a dropdown menu.
Reliability & Availability: If the third-party API is temporarily down or unreachable, that part of our application breaks.
API Rate Limiting: Most APIs enforce rate limits on how many requests we can make in a given period.
Cost: It’s a good practice to minimize API calls as some endpoints could be metered and charged on a per-call basis.
For these reasons, I chose to store the data in my own database.
OK, enough talking; let's move on to how I store the API data. This pattern is called caching semi-static data: a cron job syncs the data from the API endpoint every night and updates the database.
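In a Spring Boot application the nightly run is usually just a @Scheduled(cron = "...") method, so treat the following plain-Java sketch with ScheduledExecutorService purely as an illustration of the scheduling idea (the 2 AM run time and the method names are my own choices, not from the original project):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NightlySyncScheduler {

    // How long to wait from `now` until the next occurrence of `runAt`.
    static Duration delayUntilNextRun(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot already passed, run tomorrow
        }
        return Duration.between(now, next);
    }

    // Schedule `job` to run once per day, starting at the next `runAt`.
    static ScheduledExecutorService scheduleNightly(Runnable job, LocalTime runAt) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelayMs = delayUntilNextRun(LocalDateTime.now(), runAt).toMillis();
        scheduler.scheduleAtFixedRate(job, initialDelayMs,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With Spring's scheduler you get the same effect declaratively and with proper lifecycle management, which is why I used a cron job in the real project.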
In the steps below, we use upsert logic to correctly handle adding new countries and updating existing ones.
I used Java with Spring Boot for my example, and this is how I insert and update the list fetched from the API endpoint.
try {
    // 1. Fetch all countries from the API
    List<CountryDto> apiCountries = apiClient.getAllCountries();

    // 2. Load all existing countries from the database and index them by ISO code
    List<Country> dbCountries = countryRepository.findAll();
    Map<String, Country> dbCountryMap = dbCountries.stream()
            .collect(Collectors.toMap(Country::getCountryIso, c -> c));

    // 3. Mark all local countries as inactive initially; countries still
    //    present in the API response are re-activated in the next step
    for (Country country : dbCountries) {
        country.setActive(false);
    }

    // 4. Loop through the API results to update or insert (upsert)
    List<Country> toSave = new ArrayList<>(dbCountries);
    for (CountryDto apiCountry : apiCountries) {
        Country country = dbCountryMap.get(apiCountry.getCountryIso());
        if (country != null) {
            // Country exists: update its details and re-activate it
            country.setCountryName(apiCountry.getCountryName());
            country.setRegion(apiCountry.getRegion());
            country.setActive(true);
        } else {
            // New country: create a new entity
            country = new Country();
            country.setCountryIso(apiCountry.getCountryIso());
            country.setCountryName(apiCountry.getCountryName());
            country.setRegion(apiCountry.getRegion());
            country.setActive(true);
            toSave.add(country);
        }
    }

    // 5. Persist everything in one batch, including the countries that were
    //    deactivated in step 3 and never revisited in step 4
    countryRepository.saveAll(toSave);

    logger.info("Finished country synchronization. Synced {} countries.", apiCountries.size());
} catch (Exception e) {
    logger.error("Failed to synchronize countries from API", e);
}
By flagging all local records as isActive = false at the start and then flipping the flag back to true for countries still in the API response, you effectively perform a "soft delete." Any country no longer in the API response remains in your database with isActive = false. This is much safer than a hard DELETE, because other records (such as user transactions) may reference a country that has been removed.
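When serving the list to users, only the active rows are returned. In Spring Data JPA that is typically a derived query such as findByActiveTrue() on the repository; here is the same filter as a minimal, framework-free sketch (the record is a stand-in for the real Country entity):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ActiveCountryFilter {

    // Minimal stand-in for the Country entity used in the sync job.
    record Country(String iso, String name, boolean active) {}

    // Keep only the countries the provider still supports.
    static List<Country> activeOnly(List<Country> all) {
        return all.stream()
                .filter(Country::active)
                .collect(Collectors.toList());
    }
}
```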
One thing I need to mention here: if you are working with a microservice architecture, you have to think about propagating this update to the other services that depend on the data, and this approach on its own might not be the best solution there.
What can we achieve using this method?
Decoupling & Resilience: Our application is no longer tightly coupled to the availability of the API provider for a core function. If their API is down for maintenance, our users can still see the list of countries because we are serving it from our own database.
Performance & User experience: Reading data from a local database is orders of magnitude faster than making a round-trip network call to an external API.
Data Integrity & Control: The "soft delete" logic (the isActive flag) is crucial. It ensures that if the API removes a country, you don't lose historical data integrity. For example, a transaction record from six months ago can still be associated with "Country X" even if that country is no longer supported for new transactions.
Now let's look more closely at the logic above. As I mentioned, I used upsert logic to insert into and update the database table. So, what is upsert logic?
It means: if the record already exists in our database, update it with the new information; if it does not exist, insert it as a new one.
This prevents errors (like trying to insert a duplicate record) and simplifies the logic of keeping one set of data (our database) in sync with another (the API).
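To make the definition concrete, here is a tiny framework-free sketch where a Map stands in for the database table (the record and key choice are illustrative, not the real schema):

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertDemo {

    record Country(String iso, String name) {}

    // Upsert: overwrite the row if the key exists, insert it otherwise.
    static void upsert(Map<String, Country> table, Country incoming) {
        // Map.put already has upsert semantics: it replaces an existing
        // entry or adds a new one, so no duplicate-key error can occur.
        table.put(incoming.iso(), incoming);
    }
}
```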
How this upsert logic stays efficient:
1. Fetch both complete lists
- One API call to get the complete list of countries from the API.
- One database query to get the complete list of countries we currently store locally.
2. Create a fast lookup Map
Looking up an item in a map by its key takes effectively constant time, no matter how large the map is. This is much faster than repeatedly scanning a list.
3. Iterate and decide
We iterate over the API list and, for each country, perform the proper insert or update.
4. Add the "soft delete" refinement
As mentioned above, we use the isActive property to deactivate items instead of deleting them.
This reduces database operations to one bulk read at the start and one batch of writes at the end, avoiding the "N+1" problem, and the lookup map keeps the core logic extremely fast.
Now you can see how this mechanism is an efficient way to update an existing dataset in the database.
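The four steps above, condensed into a runnable, framework-free sketch (the records and in-memory "table" are stand-ins for the real entities and repository):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CountrySyncSketch {

    // Mutable stand-in for the Country entity.
    static class Country {
        final String iso;
        String name;
        boolean active = true;
        Country(String iso, String name) { this.iso = iso; this.name = name; }
    }

    record CountryDto(String iso, String name) {}

    // Steps 1-4: bulk read, lookup map, upsert, soft delete.
    static List<Country> sync(List<Country> db, List<CountryDto> api) {
        Map<String, Country> byIso = db.stream()
                .collect(Collectors.toMap(c -> c.iso, Function.identity()));
        db.forEach(c -> c.active = false);          // step 3: assume removed
        List<Country> result = new ArrayList<>(db);
        for (CountryDto dto : api) {                // step 4: upsert
            Country existing = byIso.get(dto.iso());
            if (existing != null) {
                existing.name = dto.name();
                existing.active = true;             // still supported
            } else {
                result.add(new Country(dto.iso(), dto.name()));
            }
        }
        return result;                              // one batch write in real code
    }
}
```

Running this with a database containing {LK, XX} against an API response of {LK, FR} leaves LK updated and active, adds FR as active, and keeps XX as an inactive row.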
Before closing the discussion, let me mention a few more optimization steps that make the job more robust.
We can wrap the entire synchronization process in a single database transaction. Imagine what could go wrong if the sync job were not transactional: what happens when a failure occurs halfway through, because the application crashes or the network connection to the database is lost? The job would stop, leaving the data in an incomplete state.
How does a database transaction solve this?
If every single step inside the transaction succeeds, then all the changes are permanently saved (committed) to the database.
If even one step inside the transaction fails for any reason, then all the changes made since the beginning of the transaction are completely undone (rolled back).
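In Spring this is usually just @Transactional on the sync method. As a plain-Java illustration of the all-or-nothing semantics described above (the Map "database" and copy-and-swap are only a model of what a real transaction manager does, not how databases implement it):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class TransactionSketch {

    // Apply all changes to a working copy; publish the copy only if every
    // step succeeds. On any failure the original data is left untouched.
    static <K, V> Map<K, V> runInTransaction(Map<K, V> db, Consumer<Map<K, V>> work) {
        Map<K, V> workingCopy = new HashMap<>(db);
        try {
            work.accept(workingCopy);
            return workingCopy;          // commit: all changes become visible
        } catch (RuntimeException e) {
            return db;                   // rollback: discard every change
        }
    }
}
```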
For a deeper look at database transactions, see this article:
Mastering database Transactions
Not only that, we can add another layer to make our logic more robust. Think about what happens if the connection to the third-party API fails due to a network issue or any other reason.
In that case, we need to restart the fetching process automatically. For that we can use a retry mechanism, and it should be a smart one.
We can use the Spring Retry library to handle this situation.
The flow looks like this:
Start Job: The scheduler kicks off syncCountries().
Failure Occurs: A transient error (like a temporary database connection issue) throws a DataAccessException.
Retry Logic Kicks In:
Spring Retry catches the specific exception.
The transaction is rolled back, leaving the DB in its original state.
It waits for 5 seconds (delay).
It calls syncCountries() again (Attempt #2).
Another Failure: If it fails again, the transaction is rolled back, it waits 10 seconds (delay * multiplier), and tries again (Attempt #3).
Final Failure: If the 4th attempt fails, Spring Retry gives up.
Recover: The recoverFromSyncFailure() method is called. It logs a severe error and (ideally) triggers an alert so that an operator knows the sync has been failing and requires manual intervention.
Success: If any of the attempts succeed, the process stops and the recover method is never called.
@Transactional
@Retryable(
    value = { DataAccessException.class, RestClientException.class },
    maxAttempts = 4,
    backoff = @Backoff(delay = 5000, multiplier = 2) // Wait 5s, then 10s, then 20s
)
public void syncCountries() {
    logger.info("Attempting to run country synchronization job...");
    // The rest of the sync logic...
    // ...
    logger.info("Country synchronization job completed successfully.");
}
@Recover
public void recoverFromSyncFailure(RuntimeException e) {
    // This is the critical "fail gracefully" step.
    logger.error("Country synchronization failed after all retry attempts.", e);
    // Add alerting logic here!
}
Finally, by following these steps you end up with much more robust sync logic. The diagram below gives a better picture of the complete solution.