Tuesday 23 March 2021

The closures trap.

Or Another Case of Shooting Myself in the Foot.


I was writing some function that uses e*cel spreadsheet import library, with the aim that its input (about 50 cells) could be either wrapped (human-readable version, 6 rows × 11 columns) or flat (human-hardly-readable version, perhaps more convenient when the spreadsheet is made by some other program.

In order not to copy too much stuff, I decided to make some cell shifting functions. For some reason my first thought was to create a list of void functions that could change pairs of numbers (rows and columns) and store them in a list. Being aware that 

Action<int, int> shift = (row, column) => 
    row += rowShift; 
    column += columnShift;
};

does nothing, I gave Action<ref int, ref int> a short try, which gave me some red underline. A no-go. Tough. A quick thought suggested using some wrapping class like

public class Coordinates
{
    public int row { get; set; }
    public int column { get; set; }
}

and 

Action<Coordinates> shift = (c) => 
{
    c.row += rowShift;
    c.column += columnShift;
};

That's it, and assuming I'd have such shifting delegates in a list, the final step would be to create some wrapping function like

private void ShiftCells(int whichShift, ref int row, ref int column)
{
    Coordinates c = new Coordinates() 
    
        row = row, 
        column = column
    };
    
    shifts[whichShift](c);
    row = c.row;
    column = c.column;
}

and use it in functions that read values from spreadsheet cells, instead of direct operations on row and column variables. Yes, I know, this is—in a way—ugly, but the whole environment is much uglier than this and part of my job is to fix this or that and stay more or less sane and keep my code cleaner than other peoples' pieces. Which is not difficult in this case, and I'm being very polite here.

In order to keep everything clear, I decided to put those row- and columnShifts in some array, like

private readonly int[,] wrappedShifts = { {1,3}, {2,-1}, {3, 0}, ... };

and of course

private readonly int[,] flatShifts = { {0,1}, {0,1}, {0,1}, ... };

Then, depending on what kind of spreadsheet is chosen by the user and appears as my input, I planned to use the apropriate set of shifts. Of course, the second set is rather obvious—just column++; - but I prefer that to not-self-documented jumps between cells, all inside if/else things...

The nightmare started right away after creating Action delegates in a loop:

for(int i = 0; i < wrappedShifts.GetLength(0); i++)
{
    shifts.Add(
              (c) => 
                    {
                        c.row += wrappedShifts[i,0];
                        c.column +=  wrappedShifts[i,1];
                    }
              );
}

I was getting some nonsense shifts, and later some errors occured suggesting that the above loop had iterated one more time than it should have had. Every time an "index out of range" exception popped up, I thought there is something wrong with the spreadsheet. Or the shifts. Or the way I was reading data from similar rows etc.

The solution to this bad riddle turned out to be: every delegate in the shifts list remembers a reference to i, not its copy. The last value of i is 4, when the loop exits. So, each of the four delegates in my list was the same, and i == 4 of course is out of bound of the wrappedShifts[,] first index. The keywords, according to Wise People, are: closures and capturing.

A workaround would be to say: int idx = i; and store e.g. wrappedShifts[idx,0] instead of wrappedShifts[i,0]. Or... use a foreach loop.

What puzzles me the most is, that the loop variable i is still alive after the loop finishes, just because it's been captured by the delegate. I accept easily that lambdas "see" variables that are in the same scope; it seems intuitive, but don't make use of this feature. I know the difference between

int i; for(i = 0; i < 10; i++) { ... }

and 

for(int i = 0; i < 10; i++) { ... }

and I'm too used to the idea that in the second case the index variable dies immediately after the loop exits.

Second—capturing and closures seem rather bad for code readability as I understand it, being a beginner. ("It could be that the purpose of your life is to serve as a warning to others"...)

No comments:

Post a Comment