1

Adventures in Groovy – Part 54: Collections Are Your Best Friend

I received a number of questions when I attended KScope. A lot of people had a tough time wrapping their head around the fact that using Groovy could actually reduce cost and development time, and significantly decrease the effort to maintain an application. The feedback I get on my lightning lessons has been overwhelmingly positive. The basis of the concept of the use of Groovy making things simpler stems from understanding, using, and becoming confident using collections. I thought I would share some examples, use cases that you all use, and some samples to get those started. If you doubt Groovy can mean simpler and faster, I HOPE this will get you to at least think about coming over to the dark side and provide some perspective that will help.

I would like to create more of these and potentially move my training to be free. Before I can do that I need everybody to watch what I post AND subscribe! If I can generate enough of that, I can create enough add revenue to fun my effort and time. So subscribe, watch, and we will see what happens.

In the mean time, let me know what you think of having videos like this rather than traditional written content. If you would like to learn more about dynamic scripts for data maps, forms, and improving performance with Esssbase scripts by removing serial mode and threading the right way, head over to in2hyperion.podia.com. I have more lightening lessons ready for you to take advantage of!




Improve the Planner Experience with Groovy (ODTUG)

Due to the overwhelmingly positive feedback from my last KScope presentation, I was asked to repeat the presentation for ODTUG. First of all, I can’t think you all enough for the generous and kind feedback! If you would like to attend, I would love to have you there.



Event Information

You can register here. The event is on Wednesday, Dec 1, 2021 from 12:00 PM – 1:00 PM EST

Presentation Summary

You have heard of Groovy, and you probably have heard of how drastically it can improve performance and functionality. The functionality talked about largely misses the focus of planning and forecasting, which is the user experience.

This session will walk through real-word examples implemented at organizations which are focused specifically on improving the user experience. It will highlight how to improve data input and new functionality that can be added which focuses specifically on the planning process. The topics will include everything from improving data entry to identifying issues before they become issues and giving users functionality that was prior only available to administrators. Allowing users to manage metadata effectively, performing multiple transfers at one time in workforce planning, and allowing users to fix data load errors on the fly will all be covered.

Attendees will leave this session with the knowledge and examples that will allow them to go back to their workplaces and use what they learned to improve their applications. They will leave with ideas, specific use cases, and the Groovy code to implement them. If you think the cloud limits your ability to be creative, think again!




Efficiently Handling Creating Blocks

You known what they say? Don’t believe everything you read on the internet. Creating blocks with DATACOPY is only the second slowest way you can do it. Do you want to know how to create a calculation using a faster performing technique to create blocks while calculating the members in those blocks?



Introduction

I see hundreds of examples of how to deal with block creation. It almost always is an issue when I am asked to get involved in projects that have performance issues. You will never hear me say “best practice.” There are reasons we all do things that go against what Oracle will say is a best practice. We have larger block sizes to improve reporting performance and we know it causes slower calcs. If the calc runs at night nobody cares if it runs 10 minutes or 60 minutes. We go against the grain on dimension order when data is sparse and performance on calculations is improved by ignoring the “ideal” order.

Block creation can be a bear if you don’t understand what blocks are or don’t understand how they get created. There are a bunch of ways to do it and there are many options that can be used. I want to educate on what a block is and show how to handle it the most efficient way because 9 out of 10 people don’t know how to handle it efficiently. I don’t know why, because it is easy to figure out if you think about it and understand some basic concepts, and even easier to implement it.

If you have a better way to do it share your techniques. Hopefully I don’t say anything incorrectly and I shed some light on block creation for the majority of the people reading this.

What Is Block Creation

Essbase stores data in blocks. A block is the combination of dense dimensions. There is a block for every stored combination of sparse members. If you don’t know how blocks work, I highly recommend you read Sparse, Dense, and Blocks for Dummies. If a block doesn’t exist, it will only get created if you tell Essbase to create it. I will get into how to do this shortly.

When blocks don’t exist, a lot of calculations appear to work intermittently. If you have this issue, it is likely due to the fact that it works when blocks already exist and doesn’t when they don’t exist. You can easily tell by loading a number to one combination of the dense members and run the calc again. Submitting a value will create the block if it doesn’t exist. If it calc works after you submit some data, it is most likely the fact that the block was not there and is now.

Block creation is something you will likely encounter on every project. How do deal with it is what I want to tackle.

Using DATACOPY – What Most Teach You

Most people create blocks by using DATACOPY. I just read a post on LinkedIn that sparked my interest in writing this because I find it terribly frustrating when people/companies represent themselves as experts and teach people inefficient ways to do things. I have most certainly done this in my life in an effort to help people, but this topic is so basic and polarizing for me because it is SO EASY to do it right. By right, I mean less code, faster execution, and easier to write in the first place. Here is likely what you have heard.

In this example, marketing expenses needs to be calculated as 5% over actuals. The method suggested is to copy data from Actual to Budget, which duplicates every Actual block for Budget. Then you wipe out the data in Budget. Then you set the data with your calculation to make Budget 5% more than Actual. It looks something like, excluding the FIX statements.

DATACOPY "Actual" to "Budget";
"Budget" = #Missing;
"Budget" = "Actual" * 1.05;

To reiterate, if you would run a calculation on Budget and no blocks existed, it would likely do nothing. If the calculation FIXes on Budget, and no blocks exist for Budget, it does nothing. If it already has some of the needed blocks, the calculation will appear to work on some of the data but not on others. To get around this, people will copy all the data from Actuals to Budget so all the needed blocks exist. Then they will clear the data from Budget. Then they will loop through Budget and run the logic. The result is this.

FIX([your fix statement], "Marketing Expenses")
  DATACOPY "Actual" to "Budget";
ENDFIX
FIX([your fix statement], "Actual")
  "Marketing Expenses" = #Missing;
ENDFIX

Does this work? Absolutely. Is it efficient? Not even close. A significant reason for calculation performance is looping through the same block more than once. Sometimes it is necessary, but think of it this way. If you had 50 Excel workbooks and you had to update all 50 of them. Would you open each one and edit one cell, then open them again and update another cell? No. You would open it and update all the cells, close it, and never open it again. You want your calculation to perform the same way. You want it to open a block, edit everything in that block, and never open it again. I understand that is not always possible, so I am not suggesting this is a hard and fast rule.

This calculation, as it is, runs through the blocks 3 times. Each fix represents a and is very inefficient. It also creates blocks regardless of whether Marketing Expenses has data or not.

Don’t use this method!

Using CREATEBLOCKONEQ – Hopefully Nobody Teaches You

Another method is to use SET CREATEBLOCKONEQ in your FIX statement. Using this means we don’t have to use DATACOPY and we don’t have to set the result to null using #Missing. It makes the calculation smaller and would look like this.

FIX([your fix statement], "Budget")
  SET CREATEBLOCKONEQ ON;
  "Marketing Expenses" = "Marketing Expenses"->"Actual" * 1.05;
  SET CREATEBLOCKONEQ OFF;
ENDFIX

This will work for you. It will create the blocks as they are needed. But, it will likely take longer than the original example. Why? Without this setting, the calculation will go through every EXISTING block. It will calculate any block that exists, but will skip the blocks that don’t exist and you will get inconsistent results.

When we set CREATEBLOCKONEQ to ON, it will check every possible block and if it needs to be created it will. The problem with this is that unless your cube is freakishly dense, it will do a lot more than it needs to. Think of putting your car in 1st gear and driving on the interstate for 10 miles. You will go extremely slow and use an enormous amount of gas compared to driving it 5th gear, in which you would get there faster and use very little gas. That is what CREATEBLOCKONEQ does.

Look at your cube statistics and check out the difference between the number of blocks and the number of possible blocks. This runs on all the possible blocks and is likely millions, if not billions of blocks that have no data.

Since the performance on this is really bad, it is rarely used. That said, if you need to run a calculation on a specific combination, like 1 block, or 10 blocks, using this is an easy way to minimize your effort and the additional time by using this is not material, and the blocks have to be created anyway. This would be like putting the car in 1st gear and driving up your driveway. Not super- efficient, but it is only 5 seconds, so who cares.

An Alternate Approach Nobody Talks About

First, will this work in every situation? No. Will it work in 99.9% of them? Yep! This example assumed Scenario is a sparse dimension.

We have established that if we fix on blocks that don’t exist, nothing will happen. Is it necessary that we create the blocks first, then calculate them? Absolutely not. We can create the block by calculating it.

Blocks will get created IF the left side of the equation is a sparse member. If you must have a cross dim operation, and the left most member of the cross dim is sparse, it will create the block. In the original example, we wanted to grow Marketing Expenses 5% over the prior year. If we fix on Actual, where all the blocks that we need grow the expense exist, then we set Budget to 105% of Actual, the block will get created. All of the blocks will get created if they don’t already exist. If there is no block for Actual, then Budget will be 0 anyway, so we don’t have to worry about if the right blocks exist for Actual.

FIX([your fix statement], "Actual", "Marketing Expenses")
  "Budget" = "Marketing Expenses" * 1.05;
ENDFIX

If you think about this, we are running through all the Actual blocks, and then we set the Budget to a 5% more than Actual. Rather than fixing on the calculations, or destination, and calculating the destination, we fix on the source, where the blocks exist, and set the destination to a value. Since we have a sparse member on the left side of the equation, and we are FIXing on Actual, where all the blocks are, we will NOT be skipping anything and you should get a total Budget of 5% more than Actual.

Will this create too many blocks? Possibly. We may not have Marketing Expenses in all the existing Actual blocks. We can optimize this by adding a IF to check to see if Marketing Expenses is either a 0 or #Missing. Rather than checking for both, I just add a 0 to it and then check for 0. 0 + #Missing is 0 and it is just a little more efficient to process and write.

FIX([your fix statement], "Actual")
  "Marketing Expenses"(
  IF("Marketing Expenses" + 0 <> 0)
    "Budget"->"Marketing Expenses" = "Marketing Expenses" * 1.05;
  )
  ENDFIX
ENDFIX

To summarize, we

  • FIX on Actual
  • Only calculate Budget if Marketing Expense is not zero and not #Missing
  • Have a sparse member on the left side of the equation
  • Have a sparse member on the left most side of the cross dimensional operator This will likely result in less blocks created as it only creates blocks where Marketing Expenses is #Missing or 0 but will create and calculate any block need to be created at the same time. What is the benefit of this? The calculation is easier to write, it takes less time to write, and it is much more efficient to execute.

Results Of Real-World A Dataset f The calculations are plain Jane. I am not using threading or any other settings to keep it simple and ensure I am comparing apples to apples.

The database I have has more data in the Plan scenario so I am replicating the same logic but using different scenarios. I need set the BlockCreationTest scenario to 5% more than the OEP_Plan scenario, rather than the prior example of setting Budget to 5% more than Actual. Different scenarios but the exact same concept.

DATACOPY Method

The calculation to copy data to create blocks, set destination to missing, then calculate the growth FIXing on the destination, will look like this.

FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0))
  DATACOPY "OEP_Plan" TO "BlockCreationTest";
ENDFIX
FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  "Regular_Cases" = #Missing;
ENDFIX
FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  "Regular_Cases" = "Regular_Cases"->"OEP_Plan" * 1.05;
ENDFIX

The data copy took 41 seconds. Updating the destination to #Missing took 5 seconds. The calculation took 42 seconds. The total time was 88 seconds and it created 66,724 blocks. This went through the blocks 3 times.

FIX On Source And Calculate Destination Method

This time, I just FIXed on the source, and used the blocks on the source to create and calculate the destination. The calculation looks like this.

FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "OEP_Plan")
  "Regular_Cases"(
  IF( "Regular_Cases" + 0 <> 0)
    "BlockCreationTest"->"Regular_Cases" = "Regular_Cases" * 1.05;
  ENDIF
  ) 
ENDFIX

This created slightly fewer blocks in 63,739. The difference means that Regular Cases didn’t have a value in all the existing source blocks. The calculation took 6 seconds. That is an improvement of over 8,000%! This method created the necessary blocks and calculated the correct values in 6 seconds compared to 89 seconds using the other method. It went through the blocks one time.

Optimized DATACOPY

Just to be fair, I optimized the calculations for the DATACOPY methodology. The example provided in the article that prompted me to write this was inefficient. I wanted to squash any comments that would suggest the DATACOPY is just as fast if it was written efficiently.

FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "Regular_Cases")
  DATACOPY "OEP_Plan" TO "BlockCreationTest";
ENDFIX
FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  "Regular_Cases" = #Missing;
ENDFIX
FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  "Regular_Cases"(
  IF( "Regular_Cases"->"OEP_Plan" + 0 <> 0)
    "Regular_Cases" = "Regular_Cases"->"OEP_Plan" * 1.05;
  ENDIF
) 
ENDFIX

The results were better. The data copy took 6 seconds. Updating the destination to #Missing stayed the same and completed in 5 seconds. The calculation of BlockCreationTest took 7 seconds.

That is still 18 seconds, which is 2.5 times slower than FIXing on the source and calculating the destination. I don’t know why you would ever need the step that sets everything to #Missing because it would get overwritten with the third step, but would be 0 anyway. Even if that step gets removed, this method is still twice as fast.

CREATEBLOCKONEQ ON Method

Before I get into the results, take a look at these statistics. The FIX statement runs on

  • 13 Sources
  • 23 Departments
  • 2,382 Companies
  • 25,001 Products

13 x 23 x 2,382 x 25,001 is 17,806,162,218 if my math is correct. That is possible blocks that can exist. If you remember, the calculations above created 66,724 and 63,739 blocks for the one without the if and with the if, respectively. If we extrapolate out the results for a calculation that took 6 seconds that iterated through 66,724 blocks (we know this because it created a block for every source block that existed), to run through 17 billion blocks, it will take an estimated 500 hours! Remember, turning this on will go through ever possible block and if it needs to be created it will create it. If not, it won’t. The calculation is as follows.

FIX("FY18", "Regular_Cases", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  SET CREATEBLOCKONEQ ON;
  "OEP_Working" = "OEP_Plan" * 1.05;
  SET CREATEBLOCKONEQ OFF;
ENDFIX

The calc had to be adjusted slightly because even with this setting on, it only creates the block if a sparse member is used on the left side of the equation. I moved OEP_Working from the FIX to the left side of the equation and moved BlockCreationTest to the FIX. I did this in good faith because there is ONLY data in Working, but there is data in multiple Scenarios, so this should run faster than if I left it the way it was in the other calculations.

I stopped this calculation after 5 hours and it only created 223 blocks at that time. If I extrapolate that out, it would take 1,500 hours to finish. Even though I only need to create 66k blocks, I have to go through 17B to figure out which ones need to be created, verses fixing on the source and it only running through 66K blocks.

To The Doubters

I get feedback all the time that this method isn’t possible all the time. There are times when it isn’t possible, but in 30 years of doing this, I can think of maybe 5 times I had to work around it. If you want to allocation a number based on history, you fix on the history and set the destination equal to the history * a cross dim to your rate. If you need to allocation based on percentages entered, then you fix on where the percentages are entered and set the destination to the correct value, as the blocks are created when the rate is entered. This doesn’t just apply to scenario to scenario block creation. You may enter a rate at your entity for eliminations, but not at the product level. You still are only going to allocate down to the products that have budget, or history, so you still can fix on where the products have data and use the rate at no product to calculate your numbers.

If you are allocating the data or have a driver that requires calculations to be created, it has to have some driver somewhere that exists at the level you want to allocate FROM, and if you have that, you can use this method.

I Accept Your Challenge

If you have a situation where you are having challenges with this logic and think you have to use DATACOPY, challenge me to come up with a way to do it. I don’t want you to EVER have to use DATACOPY!

A Cautionary Tale

With all this said, is there a drawback? Yes and no. This will always create the needed blocks. Keep in mind when you put a sparse member on the left-hand side and if you don’t have your FIX isolated to only what you need to calculate, you can potentially create a lot more blocks than you want. You will NEVER create a block at every possible block with this method because if you FIX on something, it ONLY calculates where blocks exist.

If you ran the following calculation on an empty database, nothing would happen because there are no blocks that it would execute the calculation on. To prove this, I cleared all the blocks in the Scenario BlockCreationTest and ran the following calculations.

FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "BlockCreationTest")
  "Regular_Cases" = 1;
ENDFIX

This ran in under a second and created no blocks because BlockCreationTest has no blocks.

What Hasn’t Been Discussed

There are two other ways to create blocks.

CREATEBLOCK

First is the @CREATEBLOCK. You can pass a list of members to this to create blocks. I have used this in some situations where I needed to walk balanced to the next year and I wanted to make sure the next year existed. This is just one example. It did add time to the calculation, but wasn’t significant for small datasets. A couple things to keep in mind.

  1. You have to be extremely careful with this because you can blow up your database if used incorrectly.
  2. If you use this within a fix, you only have to pass the member that is different than the block you are on.
  3. In most situations, this isn’t necessary because of the preferred method above.
  4. It will, in most situations, still require an extra pass of the blocks and negatively impact the calculation speed.

Here is an example.

FIX("FY18", "OEP_Working", "USD", "c01",
    @Relative("Source", 0),
    @Relative("Total_Department", 0),
    @Relative("ComputerTech", 0),
    "OEP_Plan")
  "Regular_Cases"(
  IF( "Regular_Cases" + 0 <> 0)
    @CREATEBLOCK("BlockCreationTest");
  ENDIF
  ) 
ENDFIX

Comparing this to the other way to create blocks, this took 40 seconds and created the correct blocks. If you remove the if, however, it will create every possible block combination possible, which is why you have to be extremely careful with this method.

Groovy

The other option is to use Groovy to create the blocks. It would be safer, but probably slower than the @CREATEBLOCK method and more complicated to write and I didn’t even bother to test it.

My Hope

I hope you read this, understand it, and have a bullet proof way to deal with block creation and a more efficient way than you may have ever thought of or been taught. I feel so strongly about this that I am more than happy to have a quick conversation if you are finding it difficult to use it.




Announcing the Essbase Ribbon, Part 2 – Power Adhoc Ribbon

Many of you used the Essbase ribbon I built 10+ years ago. I have been asked by many to build a similar ribbon for Smart View. Today I am introducing the Power Adhoc Ribbon. It has all the bells and whistles of the old ribbon completely re-engineered for Smart View. There is a nominal fee, but you can try it for free. Some of the awesome features are summarized below.

The Power Adhoc Ribbon has it’s own website where you can get updates, download the most recent version, and read more about what the options and future enhancements. Head over to the Power Adhoc website for all the details.

  • Refreshing worksheets is the cornerstone of any adhoc reporting. We put refresh on steroids. You can refresh the worksheet or all the worksheets in the workbook, but we added even more. The ribbon has two other options that let you refresh with suppression on or off, without changing your Smart View settings.
  • We know you change your aliases all the time. Now it sits on the ribbon to toggle it on the fly. Change between no aliases and the default aliases whenever you want, quickly and efficiently.
  • The ribbon also includes quick options to change your aliases between default and none. It doesn’t stop there. You can also toggle your formatting between Excel and Cell Styles.
  • Now, all your most commonly edited worksheet settings are on the ribbon, including zoom, suppression, and retention. When you change worksheets, the ribbon automatically updates to reflect the current worksheet settings.
  • What do you do when you want to distribute those workbooks with Smart View functions? Zap will remove all the functions in your worksheet or workbook with the values they return and leave any Excel formulas.
  • How many times have you wanted to duplicate a worksheet and been frustrated because it doesn’t copy your settings and connection? That is now a thing of the past. The worksheet duplicator will copy any worksheet and include the connection and worksheet settings.
  • The Power Adhoc ribbon adds a complete template management vehicle. Commonly used worksheets can be saved in groupings and reused at a later time by opening them right from the ribbon. Group them by application, purpose, or however you want. Templates can be starter adhocs or standard reports. You manage your templates and how they are organized. Your templates can be renamed, reorganized, or removed at any time.
  • Power Options allows you to show/hide the options you want. You no longer need to see the options that aren’t relevant to you.
  • Because you maintain your subscription, we can easily let you know when updates are released.  As soon as any improvements are available, you will be the first to know.
  • Not everything on the ribbon will be new to you. To reduce the need to toggle between ribbons, the most commonly used options from Smart View are included.
  • To reduce the learning curve, every option has a power tooltip that explains its purpose.
  • By having a subscription, you get all the updates and you decide what enhancements are prioritized. If there is an enhancement you need, ask for it.
  • Many operations, like refreshing and submitting, can sometimes take longer than you expect. Just to make sure you can monitor those requests, we notify you how long it took by updating Excel Status Bar.



Excel Ribbon Part Duex – I got a second wind!

For years, people have asked me to build a new ribbon to supplement Smart View that replicates the functionality I have in my Essbase Ribbon.  Multiple people have hacked into it and made it their own.  One company is selling it.  For that reason, I had zero interest in doing it again.  Time has passed, and here I am, building a ribbon.  It will again have common functions in one place, as well as some additional ones, like



  • Copy a worksheet that includes all the Smart View settings and connection information
  • HS Zapper function that will replace any cell with one of the HS functions with the value it returns for all you folks that use HSGet and its sister and brother functions.
  • Template manager that will allow you to create/manage/update templates, whether they be complicated workbooks, or just starting points for a new adhoc
  • A connection manager that will automatically log you in to an environment
    • Maybe a Shared Connection switcher to easily toggle environments
  • A bulk copy/paste option that to put in the POVs
  • Functions like retrieve with suppression, retrieve without suppression, so you don’t have to constantly change the options

This will not be written as a VBA addin.  It will be a full-blown DLL written in C# to optimize security and performance.  It will come with an installation process that will allow easy installation updates.

If you would like to be a beta tester, I am looking for 20-30 people.  If you are interested, send me a message through the contact option at In2hyperion.com.  I will include people up to the a limit to make it somewhat reasonable to manage feedback.  If you want to ensure inclusion, let me know ASAP.  It will be first come, first serve. It will be ready in weeks.

This is what you can expect as a starting point.




Adventures in Groovy – Part 52: And You Thought Essbase Only Stored Numbers

My 20+ years of using Essbase I was told, and had no reason not to believe, only stored numbers. We obviously have lists and with Planning it appears we have text. If you aren’t familiar with how this works, the strings are stored in the Planning repository and the index is stored in Essbase. If you thought the same thing, you were as wrong as I was.



What is NaN

I have been learning and implementing Groovy solutions for 2-3 years now and came across something I now only have never seen, but didn’t think was possible. Java, and therefore Groovy, has a concept of NaN. NaN stands for Not A Number. NaN is the result of mathematical operators that create non numbers. Log, square root, division, and I am sure plenty of other formulas that I learned before I was 15 and long forgot, can result in what Java interprets as non numeric values. The two that I have found are NaN and Infinity. An example of 4/0 would result in NaN. 0/4 would result in Infinity.

NaN in Groovy

Prior to about 2 months ago, I accounted for these scenarios in my logic and never had an issue. Recently, in writing some basic math, like revenue / units, I didn’t account for the possibility that revenue or units would be a zero. If these scenarios are tested in Groovy, errors are encountered and honestly, I thought my logic in a business rule would have produced a divide by 0 error when the denominator was a 0.

java.lang.ArithmeticException: Division by zero
at ConsoleScript2.run(ConsoleScript2:1)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

I thought, like in Essbase, 4/0 would result in a 0. I found out the hard way that is not the case!

Types Of NaNs and Infinities

In my case, I didn’t care of the sub type of Nan or Infinity the results was, just that it happened and I needed to account for it. These can be checked very simply.

double simpleSample = 4/0
if( simpleSample.inNaN() || simpleSample.isInfinite() ) {
  println 'ERROR'
}

Your situation might be different. If it is, these are the types I am aware of that you can check for

//NaN variances
isNaN(nan)
isNaN(zero_div_zero)
isNaN(sqrt_negative);
isNaN(inf_minus_inf);
isNaN(inf_times_zero);
isNaN(quiet_nan1);
isNaN(quiet_nan2);
isNaN(signaling_nan1);
isNaN(signaling_nan2);
isNaN(nan_minus);
isNaN(log_negative);
isNaN(positive_inf);
isNaN(negative_inf);
 // Infinite variances
isInfinite(positive_inf);
isInfinite(negative_inf);

What Do You See In Planning and Essbase

So here is where I really was confused! Everything I I thought I knew was wrong.

Surprise Number One

If either of these conditions occurs, the cell that was calculated in Groovy and stored in Planning/Essbase is actually stored differently. I can’t say for sure what happens on the back end, but when the data is exported, rather than a numeric value, it will export NaN. Yes, you will see something like 10,20,20,NaN,40….

Surprise Number Two

If either of these conditions occurs, the cell that was calculated in Groovy and stored in Planning/Essbase shows a number that makes no sense in a data form when opened in Smart View. A value of 65535 will be displayed. This value can be edited/changed. If it is the source of another member formula or calculation, it will also show a value of 65535.

Surprise Number Three

The same thing is NOT what you see in a data form opened in the UI. In the UI (web version), NaN or Infinity will actually be displayed in the effected cell. This almost makes sense if I didn’t see 65535 in Smart View.

Stop NaNs From Happening

There are probably a million ways to handle this. For what it is worth, I want to share how I handled it and why. First, I created a function in my calculation that accepted one parameter, which was the value in which I was evaluating for Nan or Infinity. Inside this I used an Elvis operator and returned 0 if it was Nan or Infinity, and the value submitted to the function if it was a numeric value. The reason I created a function was because I had more than 30 formulas that I needed to check for this and it was easier to write the code once.

double nanCheck(double input){ (input.isNaN() || input.isInfinite()) ? 0 : input }

// Use Case Example
DataCell rate
DataCell units
operation.grid.dataCellIterator({DataCell cell -> cell.edited}).each{cell->
    rate = cell
    if(rate.accountName == 'Small_Unit_Cost'){
        units = cell.crossDimCell('Small_Units')
        units.data = nanCheck(cell.crossDimCell('Revenue').data / rate.data)
    }
    else if(...)
        {...}
}

That’s A Wrap

it is really important to account for this for obvious reasons. If you are testing for NaN and Infinity, save yourself some trouble and if there is a possibility of it occurring, start doing it now. It is a pain to strip it out afterwards if it gets into a UAT situation or even Production. One last thing. If you are looking at this and thinking, this should really return #Missing. You surely can do that. There are a few changes that have to be made. First, the function can’t be double. Since #Missing is a string, it would need to be a string. The second issue is that you can’t set data, which is a double, to a string. You would have to use formattedValue. The changes would look something like this.

String nanCheck(double input){ (input.isNaN() || input.isInfinite()) ? '#Missing' : input }

// Use Case Example
DataCell rate
DataCell units
operation.grid.dataCellIterator({DataCell cell -> cell.edited}).each{cell->
    rate = cell
    if(rate.accountName == 'Small_Unit_Cost'){
        units = cell.crossDimCell('Small_Units')
        units.formattedValue = nanCheck(cell.crossDimCell('Revenue').data / rate.data)
    }
    else if(...)
        {...}
}



Upgrade Or Downgrade To Or From Hybrid In The Cloud

There are benefits to moving to Hybrid, but there are also some challenges.  The content of this post is not around the pros and cons, but the fact that you can upgrade your environment to use it.  If you find it isn’t for you, you can “downgrade” back to BSO.  The flexibility provides everybody the ability to try it.



Recreate Introduction

EPMAutomate comes with a function that allows the ability to restore an environment to a clean slate.  I don’t think this is new to anybody that has used EPMAutomate or EPM Cloud Planning.  What might be a surprise is that it does more than just reset an environment so you can start over.  It can also:

  1. Change the type of Essbase database to Hybrid or a standard BSO.
  2. Temporarily convert a Planning, Enterprise Planning, Tax Reporting, or Financial Consolidation and Close environment to an Account Reconciliation, Oracle Enterprise Data Management Cloud, or Profitability and Cost Management environment.

Using Recreate

The usage of the Recreate command is as follows, which all options.

 epmautomate recreate [-f] [removeAll=true|false] [EssbaseChange=Upgrade|Downgrade] [TempServiceType=Service_type]
  • -f forces the re-create process to start without user confirmation. If you do not use the -f option, EPM Automate prompts you to confirm your action. Be careful using this option.  If you have a long day and aren’t focused, this can make the day a whole lot worse!
  • removeAll, removes all of the existing snapshots, as well as the content of the inbox and outbox.  The default is false, meaning it retains the snapshots and the content of inbox and outbox and nothing is removed.
  • EssbaseChange upgrades or downgrades the current Essbase version in legacy Oracle Financial Consolidation and Close Cloud, Oracle Enterprise Planning and Budgeting Cloud or Planning and Budgeting Cloud Plus 1 environments.
  • TempServiceType temporarily converts an environment to a different service environment.

Changing Your Essbase Version

To change your environment to BSO from Hybrid

epmautomate recreate EssbaseChange downgrade

To change your environment to Hybrid from BSO

epmautomate recreate EssbaseChange upgrade

Trying A Different Service

There are some details that must be understood to use this option and is dependent on the version of the cloud service you have.  For subscriptions other than EPM Standard Cloud Service and EPM Enterprise Cloud Service, meaning PBCS and EPBCS, you can use this option to convert, temporarily, to

  • Account Reconciliation
  • Oracle Enterprise Data Management Cloud
  • Profitability and Cost Management environment

To use this option to convert your environment to something it wasn’t originally intended for:

epmautomate recreate -f removeAll=true TempServiceType=ARCS

To change your environment back to its original service:

epmautomate recreate

For EPM Standard Cloud Service and EPM Enterprise Cloud Service subscriptions, you can use this option to convert to any supported EPM Cloud service.  EPM Enterprise Cloud Service subscriptions use a common EPM Cloud platform. Initially, you can deploy any supported EPM Cloud business process. 

To switch from a deployed business process to another, you must re-create the environment to delete the current deployment and to bring it back to the original EPM Cloud platform. You then re-create it again as the new service type.

For example, if you created an Account Reconciliation business process but now want to create an Oracle Enterprise Data Management Cloud environment, you must run the re-create command twice.

First, reset the service.

epmautomate recreate -f removeAll=true

Second, change the service type.

epmautomate recreate -f TempServiceType=EDMCS

The acceptable service types, currenty, are

  • ARCS (Account Reconciliation)
  • EDMCS (Oracle Enterprise Data Management Cloud)
  • EPRCS (Narrative Reporting)
  • PCMCS (Profitability and Cost Management)

That’s A Wrap

It is great that Oracle allows us to do these things. We have a ton of flexibility, not normally afforded to us in the cloud, to test and use different core database types.  It also allows those using the old SKU to try the new services, or business processes, to see if they might be something you want to purchase. 

If you want to give Hybrid a try, use your test environment and give it a spin.  If you want to get exposed to one of the other business processes, you now have the ability to see it without jumping through hoops.




Working With Planning Formula Expressions

Most of us know that there is a button in the calc rule editor that allows us the ability to select a smart list and the smart list entry.  It will add something [[smartlist name.smartlist entry]].  If this is new to you, what it does is replace reference the smart list and replaces it with the numeric value that exists in Essbase.  The beauty of this is that it is dynamic, so if the smart list is changed in any way, you don’t have to go into your rules and replace change the index values for the smart list entries to match.  Guess what, there are more!



Well, don’t I feel like the F*@$& idiot, to pull a quote from A Few Good Men.

What Is A Planning Formula Expression

As described above, it is an expression that allows you to get valuable information dynamically about artifact properties in a Planning application.  The following formula expressions currently exist.

  • SmartLists
  • Dimensions
  • Planning User Variables
  • Periods
  • Scenarios
  • Cross-References
  • Workforce Cube Year to Date
  • Get ID for String

SmartLists

I already discussed the Smart List, but here is an example if this is new to you.  The calculation manager syntax is [[SLName.entryname]].

FIX (Mar, Actual, Working, FY15, P_000, "111") 
   "Product Channel" = [[Channel.Retail]] ;
ENDFIX

Which would return something like this.

FIX (Mar, Actual, Working, FY15, P_000, "111") 
   "Product Channel" = 2 ;
ENDFIX

Dimensions

The dimension expressions are not all that useful unless you are building calculations that might go across applications that have different names for the 6 required dimensions, plus currency.  Using the following dimension tags, the customized name will be returned when they are added to the dimension expression. The syntax used for this function is [[Dimension("DIM_NAME_ENTITY")]].

  • DIM_NAME_PERIOD
  • DIM_NAME_YEAR
  • DIM_NAME_ACCOUNT
  • DIM_NAME_ENTITY
  • DIM_NAME_SCENARIO
  • DIM_NAME_VERSION
  • DIM_NAME_CURRENCY

An example would look like this. This runs a calc dim on whatever your account dimension is.

CALC DIM([[Dimension("DIM_NAME_ENTITY")]]);

In this application, Entity is named Entity, so the above script returns:

CALC DIM ("Entity");

If the entity dimension was named Cost Center, it would return:

CALC DIM ("Cost Center");

Planning User Variables

Planning user variables return the user variable’s member.  This can be pretty useful if you have variables that are used to do things like fix on their area of a hierarchy.  These can be gathered through run time prompts if they exist in the POV, but that isn’t always the case.  You might use them to show the products, for example, that are under a user variable, in the rows, in which case without Groovy, it can’t be passed in RTPs. The calculation manager syntax is [[PlanningFunctions.getUserVarValue("xyz")]].

An example where the user variable is used to run a calculation might look like this.

FIX (Feb, Actual, Working, E_000, @RELATIVE([[PlanningFunctions.getUserVarValue("Product View")]],0) )
   Revenue = Units * Cost;
ENDFIX

Period Functions

Period(periodName) returns the specified period. The options for this function are and the calculation manager syntax is [[Period("FIRST_QTR_PERIOD")]].

  • FIRST_QTR_PERIOD
  • SECOND_QTR_PERIOD
  • THIRD_QTR_PERIOD
  • FOURTH_QTR_PERIOD
  • FIRST_PERIOD
  • LAST_PERIOD

This example:

FIX ( Mar, Actual, Working, P_000, "6100", FY15 )
   "120" =[[Period("FIRST_QTR_PERIOD")]];  
ENDFIX

would return a script like this

FIX (Mar, Actual, Working, P_000, "6100", FY15) 
   "120" = "Mar";
ENDFIX

The NumberofPeriodsInYear returns the number of periods in the year and NumberofYears returns the number of years in the application. The calculation manager syntax for this is

  • [[NumberOfPeriodsInYear]]
  • [[NumberOfYears]]

The following example

FIX (Mar, Actual, Working, P_000, "6100", FY15)
   "120"=[[NumberOfPeriodsInYear]];
   "120"=[[NumberOfYears]]; 
ENDFIX

would produce this.

FIX (Mar, Actual, Working, P_000, "6100", FY15)
   "120"=12;
   "120"=9;
ENDFIX

Scenarios

This one is my favorite ones.  I have been using Groovy to get these not knowing they existed.  These allow the reduction of if statements and improved performance.  We can get the open periods.  If this go across years, then my groovy solution probably comes back into play.  For ranges that include one year, or even two could be handled, this offers great functionality.  The options available are

  • Start Year
  • End Year
  • Start Month
  • End Month

The calculation manager syntax is as follows.

  • [[getStartYear(“ScenarioName”)]]
  • [[getEndYear(“ScenarioName”)]]
  • [[getStartMonth(“ScenarioName”)]]
  • [[getEndMonth(“ScenarioName “)]]

A use case would look something like this. Assume{rtpScenario} is a run-time prompt variable of type member with a default value of “Actual”:

FIX({rtpScenario}, [[getStartYear({rtpScenario})]]:[[getEndYear({rtpScenario})]],
   [[getStartMonth({rtpScenario})]]:[[getEndMonth({rtpScenario})]])
      FIX ( Working, P_000, "111")
        "5800" = 5500;
      ENDFIX
ENDFIX  

This would build out the following calculation

FIX ("Actual", "FY10" : "FY18", "Jan" : "Dec") 
    FIX (Working, P_000, "111") 
      "5800" = 5500;
    ENDFIX 
ENDFIX

If your open range consisted of two years, you could do something like this

FIX({rtpScenario},[[getStartYear({rtpScenario})]],[[getStartMonth({rtpScenario})]]:”Dec”)
      FIX ( Working, P_000, "111")
         "5800" = 5500;
      ENDFIX
ENDFIX  
FIX({rtpScenario},[[getEndYear({rtpScenario})]],”Jan:[[getEndMonth({rtpScenario})]]
      FIX ( Working, P_000, "111")
        "5800" = 5500;
      ENDFIX
ENDFIX  

Cross-References

This function comes in a few flavors but does something pretty awesome. How it works might change your naming convention a little, or make it more consistent anyway. What id does is generate a cross dimensional reference to our default members, like No Product. The syntax is CrossRef(accountName, prefix, true) but the last two parameters are optional. If you use CrossRef(“Revenue”), it would produce the following, assuming your 6 required dimensions and a product dimension.

"BegBalance"->"No Scenario"->"No Version"->"No Entity"->"No Product"->"Revneue";

I can change my prefix by adding the second parameter. I don’t like having spaces in my member names, so I would do the above with CrossRef(“Revenue”,”No_”) which would produce

"BegBalance"->"No_Scenario"->"No_Version"->"No_Entity"->"No_Product"->"Revneue";

If I change my syntax to CrossRef(“Revenue”,”No_”,true) I have a cross dim operator for all dimensions Except Period (uses BegBalance), and Currency, but includes year

"BegBalance"->"No_Year"->"No_Scenario"->"No_Version"->"No_Entity"->"No_Product"->"Revneue"; 

Used in a Fix Statement the following example

FIX (Aug, Actual, Working, FY15, P_000, "112")
   "111" = [[CrossRef("5800", "No_", true)]];
ENDFIX

would produce the following script.

FIX (Aug, Actual, Working, FY15, P_000, "112") 
   "111" = "BegBalance"->"No_Year"->"No_Scenario"->"No_Version"->"No_Entity"->"No_Product"->"5800";
ENDFIX

Workforce Cube Year to Date

If you use workforce, you probably have noticed the members it generates to get the month index for both the calendar and fiscal period. These can be used with this function to build a calendar to date value. The syntax is [[CYTD(memberName)]]. If you have renamed “Cal TP-Index” and “Fiscal TPIndex,”, then you have to specifically name the members in two additional parameters and the syntax is [[CYTD(memberName, calTpIndexName, fiscalTPIndexName)]]. This method is really easy to use and looks like this.

Fix (NOV, Actual, Working, FY15, P_000, "112")
   "5800" = [[CYTD("6100")]];
ENDFIX

If the default names are changed, it would look a little different.

Fix (Dec, Actual, Working, FY15, P_000, "112")
   "5800" = [[CYTD("6100", "Cal TP-Index", "Fiscal TPIndex")]];
ENDFIX

Get ID for String

This doesn’t solve all the problems around Smart Lists and text accounts, but it is a step in the right direction. If you don’t know, both of these are held in the Planning repository. Essbase ONLY stores numbers. Look at this as the index to the value you see in Planning. The repository has the map from index to value. In an Essbase calculation, you can’t set a text account to a text value. Well, actually, you can. The syntax for this function, which assigns a text value, is [[PlanningFunctions.getIdForString("text")]]. This allows you to set the value of a text account to a string.

In Planning, you have an account named “acct1 text” that is of type text. You want to copy your values from FY16 Dec to FY17 Mar, and change the text account to “Not Budgeted,” it would look like this.

FIX (Actual, Working, P_000, "210")
   DATACOPY FY16->Dec TO FY17->Mar;
   Mar(
      "acct1 text"->FY17 = [[PlanningFunctions.getIdForString("Not Budgeted")]];
   )
ENDFIX

That’s A Wrap

One last thing. If you use any of these in a member formula, for some reason you have to remove a bracket on each side. So, instead of two, you just need one. I will say I have not tested all of these, but the ones I have tested/used do follow this pattern. Hopefully Oracle keeps expanding these. Although they aren’t as helpful as they were prior to Groovy, they are simpler to use than implementing a Groovy solutions for some of these needs. For you lifers, it is things like this that a newb tells you. Don’t ever think you can’t learn from somebody that “knows nothing.”




Adventures in Groovy – Part 51: Dynamically Changing Storage Properties When Using Hybrid

With hybrid being used more and more there is a need to manage the storage methods of different levels of sparse dimensions.  Whether it is a staggered hierarchy or not, getting the storage method from the source can sometimes be challenging.  More often times than not, you may want to own it on the Planning side so you can change it at will and not have to go through the typical IT change order process that may take weeks, or even months, to go through the full development cycle.



Managing this manually would not be fun, especially if the hierarchy is loaded more often than monthly.  Yes, you could use the Smart View admin option, but it is manual and let’s face it, you have a ton going on and you will make mistakes. 

In Comes Groovy

With Groovy, a calculation can be written to update metadata.  I have talked about this in several other posts, but I am going to walk through a couple specific examples that are for specific situations.  I think this will spark some interest of taking this further for situation similar, or completely different. 

Reusable Concepts

Before I jump into the situations and examples, there are a couple techniques that will be reused in all the examples.  Rather than repeatedly explain them, let me first introduce them.

First, this situation assumes that the storage methods are different for the plan types.  This might be more unique, but it is easy to deal with.  If this isn’t the case, the properties in the example can be change to “Data Storage”

It is always a good idea to start every Groovy script off with the RTPS tag.  To understand more about why this is important, read Part 49  This will be used in each example.

/*RTPS: */

Each example requires methods that have to have the dababase passed to it.  The easiest way to get the cube the rule runs on is to use rule.cube.  There are other ways to accomplish it, but this is the shortest and most dynamic.

List<Member> products = operation.application.getDimension("Product",rule.cube).getEvaluatedMembers("Descendants(Product)", rule.cube)

Each example gets the dimension and holds it in a variable.  The method requires a pointer to a cube, or cubes.  Often it is easier to pass the cubes in the application, rather than one cube, to make sure all artifacts are available and not hard coded.  operation.application.cubes as Cube[] returns all the plan types as an array of variables that are of type cube.

A note about the parameters that can be used.  It is much faster to use the same parameters that are used in planning, like the options in a data map.  You CAN use most of the Essbase function.  Oracle doesn’t recommend them.  They are slower, but if you are not iterating and running the request numerous times, I haven’t noticed a difference.  In this example, it is executed once, so the performance degradation is minimal.

Dimension objDim = operation.application.getDimension('Product',operation.application.cubes as Cube[])

To get and set the properties of a member, the toMap method is used.  This will return all the properties of the member and I wrote a summary of the use of this method in a prior post found in this post – Part 11 – Accessing Metadata

Map<String,Object> memberProps = it.toMap()

Lastly, if you aren’t familiar with regular expressions, they can be of great use.  I have a module dedicated to this in xxxxx.  I struggled understanding regular expressions for years.  But I promise you, if you take 4 hours and focus on learning them, it will click.  To use it in Groovy, using the matches method allows this.  Briefly, here are some basic concepts.  A ^ means starts with.  A $ means ends with. A dot means any character, and following that with an asterisk means many.  So .* means one to many characters of any type

.matches("^.*Region$") || it.name.matches("^District.*$")

Setting All Parents To Dynamic

If you have a smaller hierarchy, one with maybe only a few levels, it might be advantageous to just set all the parents to dynamic.  The following script iterates through all the product members and sets every parent to dynamic.

/*RTPS: */
List products = operation.application.getDimension("Product",rule.cube).getEvaluatedMembers("Descendants(Product)", rule.cube)
List<Member> lev0Products = operation.application.getDimension("Product",rule.cube).getEvaluatedMembers("ILvl0Descendants(Product)", rule.cube)
Dimension objDim = operation.application.getDimension('Product',operation.application.cubes as Cube[])
products.each{
try{
       Map<String,Object> memberProps = it.toMap()
       if(lev0Products.contains(it)){
           memberProps["Data Storage (${rule.cube})".toString()] = 'never share'
    }
    else{
           memberProps["Data Storage (${rule.cube})".toString()] = 'dynamic calc'
    }
    objDim.saveMember(memberProps)
}
catch(Exception e) {
    println("Exception: ${e}")
    println it.name
}
}

Use Patterns To Set Parent Storage Property

In some situations, there are patterns to the levels of your hierarchy.  Maybe you have regional levels that are definable and unique that can be used to set different levels to dynamic.  Assume the following naming convention for this example

  • Total Products
    • West Region (everything ends in Region)
      • District 1 (everything starts with District)
/*RTPS: */
// Get every product in the hierarchy
List<Member> products = operation.application.getDimension("Product",rule.cube).getEvaluatedMembers("IDescendants(Product)" , rule.cube)
// Assign the producdt dimension to a variable
Dimension objDim =  operation.application.getDimension('Product',operation.application.cubes as Cube[])
// Loop through each product
products.each{
       // if the product matches these expressions, change the app setting to dynamic
       if(it.name.matches("^.*Region$") || it.name.matches("^District.*$") || it.name == 'Total Products'){
              Map<String,Object> memberProps = it.toMap()
        memberProps["Data Storage (${rule.cube})".toString()] = 'dynamic calc'
        objDim.saveMember(memberProps)
       }
       // otherwise change it to never share
    else{
              Map<String,Object> memberProps = it.toMap()
        memberProps["Data Storage (${rule.cube})".toString()] = 'never share'
        objDim.saveMember(memberProps)
    }
}

More Complicated Possibilities

There are a bunch of other possible needs.  Let’s say you have a need to make everything above level 3 dynamic.  First, if the hierarchy is staggered, the same level can be a 1 and 5.  You would have to decide how to handle that.  I would lean toward if it was a level 1 and a 5, I would make it dynamic because that might also mean your level 5 and 9 in that portion of the hierarchy would be a pretty deep hierarchy to make 9 levels dynamic.  Every situation is different, and performance would have to be evaluated, but the complexity of identifying how to set the storage in these situations is what I am trying to explain.

If you want to use patterns, you may also want to ensure that the pattern isn’t replicated at a parent and level 0, so there may be a need to check for both a pattern and the level of the member.

Obviously, there are an infinite amount of possibilities and each one could introduce complexity.  Just understand that almost anything can be defined by patterns and levels and can be accomplished, but the level of complexity of your logic or your regular expression may increase.

That’s A Wrap

The bottom line is that we now have the ability to do a lot of things we relied on the source system to do.  Or, maybe external scripts were run using Perl, or VBScript, or PowerShell.  We can use metadata properties, dynamic levels, any other repeatable pattern that might possibly come up.  It is fast and reliable, and completely in our control so we don’t rely on others when it is broken or needs changed.




Convert a Planning Load File to an Essbase Load File

There are a ton of reasons to convert a planning load file to an Essbase load file.  Maybe you are migrating a file from one environment to another, or simple want to load the file faster, but there are reasons to use the Essbase format.

Oracle is working on an enhancement that should be released in the next month or two that will use a load rule to load data to the app using the Essbase load format, which means the logging will be much improved, it won’t stop at the first failed line, and it will log all the errors, just like the Planning load format.  That is great news for those of us that use the planning format purely for the logging.

Performance

Before I get into the script, I want to touch on the speed of this method.  The file I used, based on a real situation, was over 89 million lines (yes, that is correct, million), and took over 5 hours to load as a Planning file.  It had to be split into three files to be under the 2GB limit, but it successfully loaded.  The file was received late in the morning and had to be loaded before the start of the day, so a 5 to 6-hour processing time was unacceptable.  By the way, yes, the file was sorted appropriately based on the sparse and dense settings.

I was able to build a unix/linux script using awk to convert this file to an Essbase load format and it only took about 9 minutes to convert.  The improved load time was pretty drastic.  It finished in under 15 minutes.

For testing, it was great, and it was perfect to improve the processing until the source system could rebuild the export in the Essbase format.  Just to reiterate, I added less than 10 minutes to convert the file, and reduced the load time by 4.5 hours, so it was worth the effort.

The Catch

Before I continue, if you are unfamiliar as to why the two load formats, here is the difference.  Essbase loads the data directly to Essbase.  The Planning load will bounce the file off the Planning repository to convert any smart list string account to the appropriate number, which is what is stored in Essbase.  This process creates a new file on the server, in an Essbase load format, with the numeric representation of each smart list account.  If you have no smart list conversions, this entire process is done for no reason, which was the case in this situation.  So, this isn’t the answer in every situation.

The Script

Before I get into the script, if you know me you know I love my Mac.  One of the reasons is because I have the performance of a Mac, I can obviously can run Windows whenever I want, and I have the ability to run Bash scripts through the terminal.  I am not a Bash scripting expert, but it is extremely powerful for things like this, and I am learning more as I need to build out functionality.

If you are a Windows user, you can install and use Linux Bash scripting in Windows 10.  You can read about it here.

There are several languages that can be used, but I chose AWK, which is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

First the script.  Here is it.  I put the awk on multiple lines so it was a little more readable, but this is one command.

SOURCEFILE="Data.csv";
LOADFILE="DataLoad.csv";
HEADERMBR=$(head -1 $FILE | cut -d ',' -f2)

awk -v var="$HEADERMBR" 
     'BEGIN {FS=","; OFS="\t"} 
     NR>1 
     {gsub(/"/, ""); 
          print "\""$1"\"", "\""$3"\"", "\""$4"\"", 
                "\""$5"\"", "\""$6"\"", "\""$7"\"", 
                "\""$8"\"", "\""var"\"", $2}' 
     $SOURCEFILE > $LOADFILE;

There are a few things you will need to change to get this to work.  Update the source file and the load file to reflect the file to be converted, and the file name of the converted file, respectfully.  Inside the AWK script, I have 8 fields, 1 through 8.  This represents the 8 columns in my Planning file, or the dimensions and the data.  Your file might have a different count of dimensions.  If your file has more or less delimited columns (ignore the POV field quotes and assume that each delimited field in that is an additional field), update the script as needed

In this example is a planning file example and each arrow represents a field.  The print section of the awk command changes the column order to fit what the Essbase load format requires.

Breaking down AWK

This won’t teach you everything there is to know about AWK, as I am still learning it, but it will explain the pieces used in this command so you can get started.

This piece is simply creating two variables, the source file and the converted file name, so there aren’t multiple places to make these changes when the script needs updated.

SOURCEFILE="Data.csv";
LOADFILE="DataLoad.csv";

The head command in Linux grabs specific lines, and -1 grabs the first line of the file.  I pipe that with the cut command to grab the second field of the header line, which is the dimension member I need to add to every row.  That gets stored in the HEADERMBR variable for later use.

HEADERMBR=$( head -1 $FILE | cut -d ',' -f2)

The example file above is repeated here.  You can see that the second field is the member and HEADERMBR is set to source_SAP.

Now the AWK command.  Before I jump into it, the AWK looks like this.

awk 'script' filenames

And in the script above, the awk script has the following form.

/pattern/ { actions }

You can also think of pattern as special patterns BEGIN and END.  Therefore, we can write an awk command like the following template.

awk '
     BEGIN { actions }
     /pattern/ { actions }
     /pattern/ { actions }
     ……….
     END { actions }
' filenames

There are also a number of parameters that can be set.

This script starts with a variable.  The -v allows me to create a variable.  The first part of this command creates a variable named var and set it equal to the HEADERMBR value.  I have to do this to use the variable in the script section.

-v var="$HEADERMBR"

The BEGIN identifies the delimiter as a comma and sets the output delimiter to a tab.  FS and OFS are short for Field Separator and Outbound Field Separator.

'BEGIN {FS=","; OFS="\t"}

Since the file has a header file, and I don’t want that in my Essbase load file, I only include the lines greater than 1, or skip the first line.  NR>1 accomplishes that.

NR>1

Gsub allows me the ability to create substitutions.  The source file has quotes around the POV field.  AWK ignores the quotes, so it interprets the field with the start quote and the field with the end quote as a field with a quote in it.  These need to be removed, so the gsub replaces a quote with a blank.  The first parameter is a literal quote so it has to be “escaped” with a /.

gsub(/"/, "");

The next piece is rearranging the columns.  I want to have the second column, or the column with the data, at the end.  I have 8 columns, so I put then in the order of 1, skip 2, 3 through 8, then the variable that was created that has the dimension member in the header line, then 2(the data field).  It looks a little clumsy because I append a quote before and after each field, which is required for the Essbase load format.  But, this is just printing out the fields surrounded by quotes (except for field 2, the data field) and separated by columns.

print "\""$1"\"", "\""$3"\"", "\""$4"\"", "\""$5"\"", "\""$6"\"", "\""$7"\"", "\""$8"\"", "\""var"\"", $2

The last piece is identifying the file I want to do all this work to.

$SOURCEFILE

I want to send the results to a file, not the screen, and the > tells the command to send the results to a new file.

> $LOADFILE

The Result

The outcome is a file that is slightly larger due to the additional quotes and replicating the member from the header in every row, normalizing the file.  It is converted to a tab delimited file rather than a comma delimited file.  The header is removed.  The app name is removed.  And the columns are slightly different as the data column was moved to the end.

That’s A Wrap

I am not ashamed to say this simple, basically one line script, took me forever to build and get to work.  By forever, I don’t mean days, but definitely hours.  That is part of the learning process though, right?  It was still quicker than waiting 6 hours for the file to load!  So now you have basically a one line awk command that converts a Planning load file (or an export from Planning) to an Essbase load file and you can get home to have dinner.