Friday 30 September 2016

Facebook relationship algorithms and how to use them to hack a relationship

They are watching you... I was listening to this Software Engineering Daily podcast about Facebook Relationship Algorithms and I had this weird idea. The more I thought about it, the more realistic it felt (as well as more than a bit creepy). Let me lay it out for you. The podcast is interesting in its own right, so go listen to it; it's instructive.

In it, they describe the metrics of your Facebook connections that they arrived at while trying to find an algorithm to detect your romantic relationship. They took a large sample of users who had declared their significant other and tried to find an automated way of predicting that from the other information Facebook had. The first idea was to use connectivity, an idea often used in sociology: the more common friends you have with someone, the closer you are. It didn't quite work, though; one clear counterexample would be coworkers connected on social media. Instead, they realized that such functional connections often cluster, so you would have the cluster of coworkers, your family, your club friends, the people who share your hobby, etc. The romantic partner would be connected with many of these friends, but across clusters; in other words, you would have many common friends who are not really connected with each other. Using this metric, which they called dispersion, they could guess who your partner is, from a list of hundreds of friends, with more than 50% accuracy on the first try.

And here is my idea: why not reverse engineer it? Imagine you have someone you want to hook up with, but have no idea how to proceed. Maybe asking them on a date is not an option or maybe, like any engineer out there, you want to maximize the chances that your experiment will work. Why not find the smallest subset of that person's friends that has the largest value of dispersion? So here is what this "hook me up" algorithm would do (there is a rough sketch in code right after the list):
  1. Collect the list of friends of your target
  2. Find a subset of friends that are well connected to the target, but less connected with each other
  3. Approach each of them and befriend them
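
To make step 2 concrete, here is a minimal C# sketch of the selection, assuming you somehow already have the friend graph in memory. The FriendGraph type, its members and the greedy selection are all mine and purely hypothetical; the real dispersion metric described in the podcast is more sophisticated, this is just the back-of-the-envelope version of "well connected to the target, poorly connected to each other":

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical adjacency structure: person id -> set of friend ids.
public class FriendGraph
{
    public Dictionary<string, HashSet<string>> Friends { get; } =
        new Dictionary<string, HashSet<string>>();

    public HashSet<string> FriendsOf(string person) =>
        Friends.TryGetValue(person, out var set) ? set : new HashSet<string>();

    // Greedily pick friends of the target that are, among themselves,
    // as little connected as possible - a small, dispersed subset.
    public List<string> PickDispersedFriends(string target, int howMany)
    {
        var candidates = new HashSet<string>(FriendsOf(target));
        var picked = new List<string>();

        while (picked.Count < howMany && candidates.Count > 0)
        {
            // The candidate with the fewest links to the ones already picked.
            string best = candidates
                .OrderBy(c => picked.Count(p => FriendsOf(c).Contains(p)))
                .First();

            picked.Add(best);
            candidates.Remove(best);
        }
        return picked;
    }
}

Step 3, actually befriending them, remains stubbornly manual.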

The result would be that you become the "natural" choice for a relationship, by going backwards and reversing the direction of causality. Automatic stalking, courtesy of your friendly neighborhood software developer: Siderite! We live in an age in which information about us can be used or abused in innumerable ways, and we have become addicted to, and stuck with, this way of relating to each other. It's not a bad thing, but it has its drawbacks. It is good to be aware of them.

Wednesday 28 September 2016

First Light (The Red #1), by Linda Nagata

First Light is Linda Nagata's first book in The Red series, which follows a military man landing right in the middle of an emergence event. Stuck between his duty as a soldier, his love for his girlfriend and father, the maniacal ambitions of an all-powerful defense contractor queen and a mysterious God-like entity which seems to like him, our hero does what he can to survive and do good by his own principles.

At first I thought it was going to be one of those cheap soldiering books. It was short, written by a woman, and frankly I expected a standard pulp fiction "read it on a train" kind of thing. Instead I was blown away by the subtlety with which the characters are explored and the way the story is constructed. I loved the book and I plan to read the whole series. I started reading The Dread Hammer, another Nagata book, this time fantasy, but it doesn't even come close to First Light. I may even end up disliking it.

Anyway, I can't say much about the plot without spoiling it, but I can certainly recommend this book. As I said, it is short enough that you can read it and see whether it evokes the same feelings in you. Far from hurting it, the female perspective of the author enhances the experience and makes it unique. The technical aspects are spot on and the writing style is fluid and easy to read. Top marks!

Monday 26 September 2016

Switch any site to a dark/light scheme

Update October 2019: a CSS media feature (prefers-color-scheme) can be used in conjunction with this. A recent development, it's a media query that allows a browser to activate CSS code based on the theme set in the operating system. You set your preference in Windows or macOS or wherever and then sites that use prefers-color-scheme will take advantage of that. Something like this:
@media (prefers-color-scheme: dark) {
  html, img, video, object, [style*=url] {
    -webkit-filter: invert(100%) !important;
    filter: invert(100%) !important;
  }

  body {
    background: black;
  }
}


I am a very light-sensitive person. Shine a light in my eyes and you limit my productivity immensely. Not to mention it makes me irritable. Therefore I often feel the need to turn cheerful black-on-white sites into a dark theme, where the colors are reversed. I am sure other people have the same problem, so I thought of building a browser extension to enable a button for switching between the two.

The first problem is that I need to interrogate all the elements in a page, including the ones that will be created later. The second is that even so I would have problems determining the dominant color of images. But there is something I can use which makes all of this unnecessary: using the invert CSS filter! Since I already use a browser extension that injects my own styles in any site - it's called Stylish and I highly recommend it - all I have to do is apply a filter on the entire site, right?

Wrong! The problem is that when you invert an entire site, all images on the site get inverted, too. That also includes videos and Flash objects. The worst offenders here are the elements that sport a background image that is declared via CSS, since you can't create a CSS selector for them. I am going to present my partial solution and maybe you can help me find a more elegant or more complete one. Here is a general dark theme stylesheet, without the elements that have a background image declared via CSS (it does include those with a background image declared inline, though):
html, img, video, object, [style*=url] {
  -webkit-filter: invert(100%) !important;
  filter: invert(100%) !important;
}

body {
  background: black;
}

What it does is invert the entire page (html), then reinvert video, img, and object elements, as well as those with "url" in the style attribute. In Chrome, at least, there seems to be a bug in the sense that the backgrounds of the direct child elements are not inverted, which means body, as the first child, needs to have the background set to black specifically. The hack to invert elements with "url" in style is pretty ugly, too.

What I have in mind as a solution is this:
  • inject Javascript to enumerate all elements present and future using document.createTreeWalker and Mutation Observers, check if they have a background image and if so, add a class to them
  • inject the CSS above with an additional rule for the class for the elements with background image

However, this doesn't completely solve the problem. One of the major issues is that the inverted colors sometimes look dumb. For example a red background turns cyan, and white text on a light gray background turns black text on a dark gray background, which makes it hard to read. I've tried various other filters, like hue-rotate or contrast, but they don't really help. Detecting individual color patterns doesn't really work either, as the filter attribute affects an element and all of its children. The CSS above only works because the images are inverted again after the entire page has been inverted.

The good news is that most of the time you can use the CSS above as a template, then add various rules (manually) to fix small issues with background colors. Even if I don't package this in an extension, you have the power to create your own themes for various sites. Never again will you be subject to the tyranny of the happy bright shiny people!

Update: if you have read this incredulously, thinking that I am preaching dark themes on a light themed blog, think again. If you go to Menu in the top-left corner you get the option to switch to a dark theme.

Sunday 25 September 2016

Eye openers from the SQLSaturday meeting

I was glad to attend SQLSaturday #565 in Bucharest yesterday and, while all presentations were cool, I wanted to share with you some specific points that I found very revealing. Without further ado, here is the list:
  • SQL execution plans are read from right to left - such a simple thing, but I remember trying to read them from left to right and not getting anything. In SQL Server Management Studio 2016 you also get a "live" version, which shows you an execution plan while it's executing. Really useful for seeing where the blocking operations are.
  • Manually control your statistics updates - execution plans are calculated based on statistics, but the default condition for updating the statistics is that 20% of the rows plus 500 rows of a table have changed. This default is completely arbitrary and may cause a lot of pain. Not only does updating the statistics block your table (which means more chances that the table will be locked when it is most used), but sometimes the statistics are not even useful. One example is reports, which may receive a startdate/enddate range or a count or something like that, making the number of rows affected vary immensely with different parameters. Use OPTION(RECOMPILE) for those.
  • Look for a difference between estimated and actual rows in a query plan, which leads to tempdb spills, which lead to unwanted IO operations - before a query runs, an execution plan is created or reused based on statistics, as I was saying above. Once a plan has been chosen, though, it doesn't change during its execution. Basically what this means is that the structure of the plan remains unchanged between the estimated and actual plan. Memory is also requested based on the plan and never adjusted afterwards. So if the plan asks for 10KB of memory and you actually need 1000KB, the remaining 990KB will be stored in and read from tempdb even if there is enough memory available, since the memory requirements don't change from estimated to actual. The reverse is not much better, since a plan may ask for a lot of memory when it only needs a little, leaving everything else on that machine with fewer available resources.
  • SQL default settings suck - there was an entire presentation about this, and it is useful to think a little about it. So many settings are legacy things that make no sense, like the initial database size, the autogrow size, index fill factors, MAXDOP (the max degree of parallelism), the parallelism cost threshold, the memory SQL Server is allowed to use (ironically, giving it all of it may be hurtful, as it takes memory away from other processes, which leads to using the swap file), etc.
  • Look for hard page faults - this counter is much more useful than the soft page fault one, since soft faults are resolved from memory. A hard page fault indicates unnecessary IO operations, which are orders of magnitude slower than memory access.

There are a lot more things that I want to explore now that I have attended the event. You may find the files for the presentations in the same place as the full list of talks at SQLSaturday.

Saturday 24 September 2016

The Dragonbone Chair, by Tad Williams

Tad Williams probably fancies himself another Tolkien: he writes long descriptions of lands and people and languages, shows us poetry and songs, tells us about the rich history of the land. And all of this while we follow yet another common but good boy, with a mysterious ancestry, as he and his merry band of helpers fight THE DARK ONE. It's the same old story, with the hapless youth guided by wise but not forthcoming people who tend to die, leave or otherwise shut up before the hero gets the whole story and can do anything about it. Regardless, he is young and lucky, so it's OK.

If The Dragonbone Chair had been fun or interesting, or had at least shown us a character we could care about, this book would have been readable. As it is, it's a trope-filled, boring and sleep-inducing thing. You have to wait until halfway through the book to see the things you predicted would happen from the first few chapters. I couldn't even finish it. More than two thirds into the book and there is no significant part of the story that involves either dragons or chairs.

Bottom line: it sucks!

Wednesday 21 September 2016

No one knows how to interview people

After a slightly misused sabbatical year, I went through a period of trying to get hired. That means interview after interview with people that were assessing my fit within their company. Man, that sucked! I mean, I am a white male professional in a business where everyone is looking for personnel and still it was frustrating, demeaning and painful. But I am not here to complain (maybe a little :) ), only to share my experience and my... constructive criticism.

The story


OK, so first off, my only other real experience of looking for work was more than ten years ago, and back then I was an absolute beginner on the market. However, back then I knew I was a nobody, while now I know that my experience and passion put me way up there as far as usefulness and value go. I may have started off this campaign with an unhealthy level of smugness, but it wears off quickly, I assure you.

I am lucky that I had this year of experimenting before I started looking, which allowed me to treat the job hunt as any other experiment: I accepted almost all interviews and went diligently through the entire process, no matter my personal opinion of the company. That helped a lot as a learning experience; while I know how to code, it quickly became apparent that I have no idea how to convince others of it. I set out to try everything and learn from it, while continuing to be principled, which in my mind meant being completely honest. I didn't expect company people to be honest with me, but that was on them. I would be a perfect WYSIWYG candidate. Better to fail fast, rather than have a miserable experience in the relationship. BTW, that is also my strategy with girls, which explains why I am a virgin. I stuck to my guns, though.

The experience


I am not going to name names here. This is not about how awful some companies may have been. It is all about my perception of the hiring process. And it is that it sucks!

I don't know if it goes more smoothly in other fields, but imagine this: the only people who have any idea how to hire people are the HR people, and they have no idea what software programming is like. I could be married to an HR manager and she still would not know anything about software development. The technical people may know how to code, but they have no idea how to determine whether the other person is any good, and if you are a technical person you are definitely not a human resources person. I applaud people who can be both, mostly because I have not met one and I can't think of any scenario that would produce such an individual other than some radioactive alien arthropod biting a regular person.

Funnily enough, compared with my past experience, HR mostly liked me while the technical people (the majority more than a decade younger) mostly dismissed me. At first I felt like a complete impostor (of course), my self esteem plummeted (which didn't help any), and I was about to beg for a job (which is what most people do, I hear). However, I tried to see the situation from the point of view of the people trying to hire me (I know: shocking!) and I could understand their situation and empathize. Think about it. How would you test someone for a programming job? Who would you call into that meeting? What would be the salary that you would budget for that person (in EUR, after taxes)?

Did you really think about it? Come on, make an effort. I guarantee it's worth it.

OK, so the HR people looked at my resumé and saw that I have had a lot of stable jobs before, in all kinds of environments. I was a pleasant enough person (I mean, for a techie, which means without obvious homicidal tendencies) with a very good understanding of the English language. No obvious conflicts, although I may have been too honest in my (err... constructive!) criticism of past employment. I mean, come on, every dev can tell you that managers don't know what they're doing, right? If I think about it, the job of the HR department seems fairly simple to me: look for a candidate that fits the profile, lure them in, do the most simplistic psychological screening possible, then pass them along to the tech department. It's something that AIs will probably take over soon. I may yet look back fondly on these times, when there existed people who were actually biased towards me! The typical HR person is a girl. Now, if I am being insensitive here, I apologize, but if you want to seduce a tech to the point where he would do anything to come to you, you use a nice, sexy girl. It's only natural when devs are mostly male virgins. To be honest, these girls could have hated me or wanted me to have sex with them and I would have had no clue whatsoever. If they said it, I took it for granted. If they lied, again, it's on them. If they remained quiet, then I couldn't parse it into text, and whose fault is that?

So then there was the tech interview. You have some guy who thinks he's God because he can code and maybe has an overview slightly broader than that of his juniors. He is young, probably coming from some technical university (yes, in Romania people actually do look for coding work after studying Computer Science). He has no idea how to conduct an interview, but admitting that, even to himself, is a bit too discordant with his view of himself. So he does what every tech would do in this situation: he Googles it. You might be amazed, but Google actually turns up some good advice - provided you are willing to admit that your expectations of how to do it may have been completely wrong. So he does the second thing anyone does when Googling: he looks for links that validate his own beliefs (and that also have some template for interviews he can quickly print and use).

Am I being unfair here? Probably. But it is a good theory to explain the types of interviews that I had and how they all seemed carbon copies of each other. The template is basically this:
  1. an algorithmic question, such as: how do you refresh a sorted list from another complete sorted list, or how do you intersect two sorted lists, or how do you search in a sorted list or... wait a minute, are they all about bloody sorted lists?
  2. general algorithmic knowledge questions, such as: what is the difference between a list and a linked list, or an array and a list, what are the complexities of operations on lists. Pretty much there has to be a data container there.
  3. general language knowledge questions, such as: behavior of some implementation in a specific language, the results of SQL queries, characteristics of SQL indexes, some HTML stuff, the life cycle of ASP.Net if you are really lucky...
  4. tools and ways of working in tech from previous employers. Here they are actually interested, because while they appear to be judging everything you say, they actually want to hear of better ways of working themselves.
  5. questions about a project you really liked or had a lot of influence over. Yeah! And while you feel like an idiot because no one ever let you work on a project that you think was special, the interviewer learns from your experience and adapts it to their crappy project.
  6. asking you if you have questions for them and looking like they expect you to have some really sensible and relevant questions when all you want is to know if they want to have you or not

The existence of the first step is fed by sites like HackerRank, Codility and CodingGame, which should never be considered anything other than learning tools, if not just silly games. However, since these people went through grueling university lectures about algorithms and then inflated their own egos playing on the web sites above, they assume you should know about them too. It doesn't matter that they rarely found a use for any of it when working on their projects; they just push it under the vague concept of "wanting to see how you think". However, these are not logic puzzles (like they used to ask five years ago, copying from Google and the like), they are very specific coding situations. You may know the exact solution - because you played around with algorithms when bored - or you might have no idea how to solve the problem.

And here you are now, facing a guy who looks critically at you, while you try to understand the problem, find the best solution and implement it before he gets bored. It will take him about 60 seconds to get bored, too, as he already knows the answer to the question and he feels it's obvious and the only one possible. Hell, he knew how to solve this before he even left university! It doesn't matter that several scenarios fight for supremacy in your head, that for each there is another solution, that the very simple solution feels too simple and your brain is wracking itself to find another one - which would probably be either wrong, over-engineered or both. And you want, you really want, to implement the three-step algorithm that you know always works (1. Google it 2. Think of something better 3. Use the best implementation found), but you believe it would be perceived as not knowing your stuff. I mean, what if you are at work and your Internet dies? Surely you need to solve the problem anyway, right?

In truth, the lucky scenario is if they send you to HackerRank or something similar to solve a technical test before you meet with anybody. That goes over fast and easy, while you hack comfortably in your underwear and you have no stress about who is thinking what. The unlucky scenario is that you get a guy who thinks you are not a true developer unless you are working on open source projects on GitHub in your spare time. Oh, and they need to be interesting to him.

Yet, after you go through the first two steps, the rest are a breeze: you know your stuff, you know your languages, you can even think of a project or two that had something remotely instructive in them. It feels like you went over a bump, but now you can go full speed. They ask you various things about your past experience, you gladly oblige, make a few jokes, get some laughter, start to feel good about yourself. Surely, you will pass the interview.

And then you get "the call", where you know you have failed from the tone of the HR girl who needs to tell you that they won't be going further with it. You still hope against hope while she goes on and on and on through her complex script of letting you go easy. She thinks she's being thoughtful, yet you are a tech and you want the answer first and the explanations after. And you despair. Obviously, you suck. You will never get a job. You are worthless, less than worthless, a complete buffoon. And here you were, thinking that years of successful work with people who appreciated your efforts meant anything. When was the last time you learned something new anyway? Last month? Three programming languages and five new frameworks have launched since then, not counting the new versions of old frameworks that you never got around to mastering. Who were you kidding? There are ten year olds that can code better than you. And they are not married yet! You know getting a dog will ruin your career. You are a fossil, admit it! In five years you will be begging for food on the street with a sign that says "Will code for bread". And you know what? You are right: you are an idiot!

I cannot claim absolute truth here, because I don't have enough data to arrive at a clear conclusion. That is because when they flunk you, the sweet HR girl stops contacting you altogether. If you are lucky, the company didn't use their own human resources department and you instead arrived through a dedicated HR firm (headhunters), who have the decency not only to tell you that you didn't pass, but also to make the effort to tell you why. The people you maybe knew at that company, who were really supportive of you joining their team, drop off the face of the Earth. Clearly, you were too stupid to work there, so they cut you off. At the very best you are an emotional mess and they don't want to have anything to do with that. So the next section is mostly speculation, but I will try to make it sound good.

The Explanation


There are a zillion reasons why people don't want you on their team for the specific project they are working on, reasons that have nothing to do with your value as a human being.

You might have asked for a sum that is larger than what they were prepared to offer. Even if you are that valuable, they are too cheap for it. It's like the girl who dumps you without telling you why (maybe mumbling something about you being insensitive) because you said you liked anal and she was afraid to try it. It may also be because you think you deserve more than people are actually willing to pay in general. That's on you. However, the correlation between your skills and your pay is not linear. It mostly depends on the market. After all, you are trying to sell yourself. You are already a whore; now you are just negotiating the price. Today you may be a hero, tomorrow you will be the guy that made some money in that [enter fleeting fad] boom and lost it all in the subsequent crash.

People might put a lot of value on algorithms. It may be a good decision, because in their project they often meet situations where good algorithmic knowledge saves the day. If you are not good at it, or you couldn't prove it in the makeshift interview you flunked, they have every right not to go through with it. They might also not know any other way to test your knowledge and be too lazy to actually look for value in people. There are a lot of other similar reasons that you may file under this scenario. They wanted something specific from you, didn't tell you, and you don't have it.

They think you are old. And you may well be. Age discrimination aside, why should they hire someone like you when they perceive the same value in a guy 20 years younger - and way cheaper? If I had to choose, I would have no qualms whatsoever. This is also linked to expectations. Remember when you were counseled to try to move to a manager position before you got to [enter ridiculously low number here]? That translates to the expectation that after that age, being a simple techie means there is something wrong with you. This will never ever change. Look at the average age in all the big companies: it's about 30. Startups even less, around 22. That doesn't mean you need to become a manager; I am sure old managers feel just as threatened. Plus, you might really suck as a manager. I know I would. Unofficial sources say that even the places that usually hire people for experience (like government jobs) stop looking at resumes of people over 45. Age does matter, so plan for it.

And then there is the idea that if you are inexperienced you can learn more quickly the things that "you really need", like that framework that became famous while you were reading this blog post. You may be experienced, but will they need to fight with you on whether to use ASP.Net MVC over ASP.Net forms? (or is ASP.Net MVC obsolete already? I don't know, I was blogging). I don't know if that's true. I did learn more quickly when I was young, but that was mostly by failing miserably again and again. On the other hand, a job where you are hired for your ability to fail your tasks sounds pretty good, doesn't it?

There is also the personal thing. You might have rubbed someone the wrong way. That doesn't mean you are an awful person, only that you don't match with a person you might have had to work with if they had gone forward and hired you. Again, do you want to be married to the girl that hates you, no matter how big her boobs are? You may be an asshole, but maybe the other person was, too. Some people might feel threatened by you, either because you threatened their life if they don't hire you or because they think you are way sexier than them and would cock-block their attempts to woo the HR girl. You won't become a better person by trying to be liked by everyone. They might have hated your shirt, for example, the one that you thought would look really professional, but that they saw as threatening, because they usually work in shorts and t-shirts. Point is, they had some expectations and you didn't meet them. Were they justified? You don't know and you shouldn't care.

The position you would be working in is equally important. You might be a brilliant web developer, but if they actually wanted a server guy, or vice versa, they will drop you. They will not admit that they were not specific in their job description, of course, and will instead just blame you for not being "a good fit". Imagine you are a cube crying that it didn't fit in a circular hole. Ridiculous, right? I mean, would the tears be blocky? As a specific example: I went somewhere for several interviews. I was "a perfect match" for one position and not the right person for another. The job description document they sent for both positions was identical.

Solutions


Since I code and have an overview of life, I can definitely tell you how interviews should be conducted, but you will have to buy the paid version of this blog for that. See, I am learning fast: in one phrase I managed to be a tech, a business person and an asshole all at once.

The truth is that the only way I could think of that wasn't insulting to everyone's intelligence was to actually show them a computer, a real problem, and let them fix it with me watching and helping next to them. And it still wouldn't be sufficient. If it were "real" enough, then it would take time to understand all of the aspects of the problem. The guy might be overly nervous with me next to him. I know people that can only work when no one is watching, and they do great work. Plus, they may have experience on that exact problem and suck at anything else.

Unfortunately, in all my interviews the only tools that I had to work with were pen and paper. Putting aside the fact that I can't even read my own handwriting, the last time I had to actually write anything on paper was... oh yeah, the last time I was looking for a job. Does it make sense to conduct software development interviews with no computer? I would say no.

Conclusions


There is a schism between what they expect and what you expect, between what you think of yourself and what they do. That is the real reason behind every failed interview. It doesn't really matter if they had unrealistic expectations, but it matters a lot if you did. Treat it like every experiment: acquire data, reason about the data, propose a theory that explains it, test it against new data. The best way to achieve anything is to change your behavior towards the goal. The important thing here is to define the goal. Is it to get hired at any and all costs? Or is it to find the place where you will enjoy working, keep growing and be appreciated for your efforts?

In the end it so happened that not only did I get hired at the company I was aiming for, but I got the position I feel I was best suited for, rather than some mediocre second best. As with dating girls, it is worth waiting for the right one. And with software, you get to do side projects, too!

Tuesday 20 September 2016

The default List implementation in .NET and how much it sucks when you think about it

Update 12 August 2018: I've updated the project to .NET Core 2.1 and changed the Add method to handle OutOfMemoryExceptions in case of memory fragmentation.

I have been looking for a job lately and so I've run into these obnoxious interviews where someone asks you to find the optimal algorithm for some task. And as much as I hate those questions in an interview, they got me thinking about all kinds of normal situations where the implementation is not optimal. I believe it will be hard to find something less optimal than the List<T> implementation in .NET.

Reference


For reference, look at the implementation code in its entirety.

Story


I will be discussing some of the glaring issues with the code, but before that, let's talk about the "business case". You have a C-like programming language, armed with all the default types for such, like bool, int, array, etc. and you need a dynamically sized container, one where you can add, insert and remove elements. The first idea is to use an array, then resize it accordingly. Only you can't know how large the array will need to be and you can't just allocate memory and then resize that allocation, as other variables have already occupied the next blocks of memory. The solution might be to allocate an initial array, then - when its size is no longer sufficient - create a larger one, copy what you need and make the changes in it.

A comment in the source tells us what the developers meant to do: The list is initially empty and has a capacity of zero. Upon adding the first element to the list the capacity is increased to 16, and then increased in multiples of two as required. So, according to this, an empty list is just a wrapper for an empty array. A list with one element occupies the size of a 16 element array, and once you add a 17th element, the array doubles, then continues to double as needed.
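
If you want to watch the growth policy in action, a few lines of console code are enough. A small caveat: the default capacity constant in the reference source is actually 4, not the 16 the comment promises, so the sequence starts lower, but the doubling behavior is exactly as described:

using System;
using System.Collections.Generic;

class CapacityDemo
{
    static void Main()
    {
        var list = new List<int>();
        int lastCapacity = list.Capacity;
        Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");

        for (int i = 0; i < 100; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // The internal array has just been replaced with a bigger one
                // and all existing elements have been copied into it.
                Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");
                lastCapacity = list.Capacity;
            }
        }
        // Typical output on current .NET versions: capacity 0, then 4, 8, 16, 32, 64, 128.
    }
}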

Implementation


Let's check the algorithmic complexity of the add operation. For the first 16 elements you just add elements to the internal array: n operations for n elements. Once you get to 17, the cost increases, as you need to copy all previous 16 values first. Now it's 16+16+1, and it goes on like that up to 33, where you have 16+16+16+32+1. At 65 it's 16+16+16+32+32+64+1. So, as we add element after element, the number of operations is on average about twice what it would be for a pre-sized array. Meanwhile, the allocated space is on average half again as much as you actually need, and almost double right after a resize.

Insertion is similarly tricky, but even more inefficient. Basically, when you insert a value or a range, the first operation is EnsureCapacity, which may copy the entire array into a new one. Only afterwards is the insert algorithm run, which again copies the part of the array that comes after the insert index.

Removal works in the opposite direction, with the caveat that it never decreases the size of the array. If you added 10 million records to your list, then deleted them, your list still holds an internal array that is 10 million elements in size. There is a method called TrimExcess that tries to solve this, but you must call it manually. At least RemoveAll uses an O(n) algorithm instead of calling Remove repeatedly, which would have been a disaster.
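
This one is trivial to see for yourself; here is a minimal sketch (be warned, it allocates a lot of memory just to prove the point, and the exact capacity you get depends on the doubling sequence):

using System;
using System.Collections.Generic;

class TrimDemo
{
    static void Main()
    {
        var list = new List<int>();
        for (int i = 0; i < 10000000; i++)
            list.Add(i);

        list.RemoveAll(x => true);          // the list is now empty...
        Console.WriteLine(list.Capacity);   // ...yet the internal array still has over 10 million slots

        list.TrimExcess();                  // only now is that array dropped and eligible for collection
        Console.WriteLine(list.Capacity);   // 0
    }
}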

The piece of code that sets the internal dimension of the list is actually in the setter of the Capacity property, and it dumbly creates an array and copies the values from the current one to the new one.

A lot of the other operations are implemented by calling the static methods on the Array class: Array.IndexOf, Array.Sort, Array.BinarySearch, Array.Reverse, etc.

The last issue that List<T> has is that, as an array wrapper, it needs contiguous memory space. There will be times when your code fails not because there is not enough free memory, but because the memory is fragmented and the runtime cannot find a free block large enough for your data.

Better solutions


Before I start spouting all kinds of stupid things, I will direct you to the venerable C5 collection library, which is very well designed, documented and tested. It contains all kinds of containers to optimize whichever scenario you might have been thinking about.

Let's think about solutions now. The major problem of this implementation is the need for a contiguous array. Why not solve it by adding more arrays instead of replacing the existing one with another twice as large? When the capacity is exceeded, we create a new array of similar size, then link it to our list. And since we want index access, not linked list access, we need to add this array to an array of arrays.

What would that mean? Memory doesn't need to be contiguous. Adding is twice as fast. Inserting is also fast, as you only need to insert a new array between existing arrays and move data around in a single inner array. Accessing by index is a bit more complicated, but not by much. Removal is just as simple, with the added bonus that some inner arrays might become empty and be removed directly.
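
To get a feel for the index access part, here is a bare-bones illustration. The type and member names below are mine and deliberately simplified, not the actual code from the repository: you walk the fragments and subtract their lengths until the index falls inside one of them.

using System;
using System.Collections.Generic;

// A stripped-down illustration of index access over a list of fragments.
public class FragmentedList<T>
{
    private readonly List<List<T>> fragments = new List<List<T>>();

    public void AddFragment(List<T> fragment) => fragments.Add(fragment);

    public T this[int index]
    {
        get
        {
            foreach (var fragment in fragments)
            {
                if (index < fragment.Count)
                    return fragment[index];   // the index falls inside this fragment
                index -= fragment.Count;      // otherwise skip over it and keep looking
            }
            throw new ArgumentOutOfRangeException(nameof(index));
        }
    }
}

A real implementation would also cache the cumulative lengths of the fragments, so the lookup becomes a binary search instead of a linear walk.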

This is far from being "the best" option; it is just one level of optimization that tries to fix the biggest single problem of the current implementation. As one of my friends used to say: first collect data about your program, see where the bottlenecks are, then proceed to fix them in decreasing order of importance. The source can be found on GitHub.

Notes


A tester program of sorts shows the Count, Capacity and time for random operations on a normal list and on the FragmentList<T>. The next line shows the individual lengths of the inner arrays. Note that I cheated by using Lists instead of arrays. It only makes sense, as the simplistic operations of List<T> now have a limited negative impact on individual fragments. Take note of AutoDefragmentThreshold, which is 0.8 by default. It replaces all the internal lists with a single contiguous one when more than 80% of the internal lists are smaller than a tenth of the total count. To disable the feature you need to set the value to more than 1, not to 0. I also implemented all the public methods of List<T>, although you might only need to implement IList<T> and the *Range methods.

Well, enjoy! Hope it helps.

Did you know about StringSplitOptions?

Today I've learned something old. Yeah, it has been there since .NET 2.0 and probably at one time or another I knew about it, then I just forgot about it. I am talking about the string.Split method overload that accepts a StringSplitOptions enumeration. In fact, it is an enum in name only, because it has just two possible values: None and RemoveEmptyEntries.

I had forgotten that I can eliminate empty entries like that, even if, truth be told, splitting the string with a regular expression and then maybe filtering the results with LINQ feels much better, both in readability and in control over the result. So, what is your preferred method of splitting, say..., a sentence into words?
text.Split(" \t".ToCharArray(),StringSplitOptions.RemoveEmptyEntries);
or
Regex.Split(text, @"[ \t]+").Where(s => !string.IsNullOrWhiteSpace(s));
? (I realize that in the case I am describing I don't need the LINQ at the end except in fringe cases, but it was an example)

Monday 19 September 2016

Why isn't Jupiter the planet with the hottest surface in the Solar System?

There was this quiz from the Planetary Society where Robert Picardo was interviewing people at a sci-fi convention and asking them which planet has the hottest surface in the Solar System. The expected answer was Venus. Yet, if you think about it a little, Jupiter probably has a solid core, gases turned liquid by enormous pressure, and the temperature at the boundary between the gaseous and liquid layers must be immense. Shouldn't Jupiter be the answer?

No. And that is because of the definition of a planetary surface. Quoting from Wikipedia: Most bodies more massive than Super-Earths, including stars and gas giants, as well as smaller gas dwarfs, transition contiguously between phases including gas, liquid and solids. As such they are generally regarded as lacking surfaces.

So there you have it: Jupiter loses due to a technicality... just like Pluto! Just kidding. I was cheating, as there is no clear boundary between the gaseous and liquid states inside Jupiter. It is hard to imagine that, since we have a clear image of what a liquid and a gas should look and behave like; it's like boiling water in a kettle and not being sure where the steam ends and the water begins. Jupiter, though, is a big kettle.

Monday 12 September 2016

BitTorrent and Avast [Win32:Evo-Gen (Susp)]

Today the free antivirus Avast reported my BitTorrent installation as infected with Win32:Evo-Gen (Susp). It also promptly removed the executable of the program. I tried to reinstall it, only to have the same thing happen to the installation program. I've reported a false positive (I hope it is one) for BitTorrent Stable (7.9.8 build 42577) and then added *bittorrent* as an exclusion pattern in Avast. I could then reinstall the program, retaining all of the settings and torrents in the download list.

Sunday 11 September 2016

The Alcatraz series, by Brandon Sanderson (Books 1-4)

I got this bundle of four books in the Alcatraz series, by Brandon Sanderson, and since I loved all of his books so far, I enthusiastically started to read them. First, you need to understand that there are five books in the series, with The Dark Talent just published. I think it pays to wait until you have it before you read the whole thing, since it pretty much reads as one long story (unless you read the beginning and then immediately skip to the end of each book, heh heh heh!). Second, you need to know that at the beginning the writing will appear waaaay too silly and under par compared to the author's other works. After a while it kind of grew on me, but be aware that it is written mostly for kids. Sanderson even says so in the story itself.

So at first I kind of thought this would be the series that breaks the rule, the one I would not enjoy reading, and it took some time to shake off this feeling. It feels like the author could not make up his mind about what to write: the obligatory book about writers (after all, the rule says "write what you know", so in the end it's inevitable) or the recently obligatory Harry Potter spoof (which fantasy authors are peer-pressured into writing). In the end he wrote something that has both: a story about a boy discovering he has magical powers and also a book filled with meta comments and fourth-wall breaks (the character has the Breaking Talent, see) that shows some of the tools and processes of the writing business.

In fact, the more I read, the more I enjoyed the books. The characters are as always incredibly (annoyingly) positive and there is that Sanderson smartness behind even the silliest of exchanges. He references future scenes, with book and page number (which is amazing if you think about it), he hooks you with a scene and then berates himself for using hooks in the book, he drops really silly and out-of-context details only to bring them back to full effect a book later. In the end, you get something that children will undoubtedly read with giggling pleasure and that adults (especially those interested in writing) will see as a deconstruction of the writing process.

Now, I still feel the Alcatraz series is one of the lesser Brandon Sanderson books. The silliness sometimes feels forced and the way he writes each book changes slightly, as he experiments with the crazy shenanigans that he started with this series. It's still very entertaining, though. Give it a try, or give it to your children so they can get into writing themselves and make your life a living hell when they grow up :)

Friday 9 September 2016

The disaster of 9/10

I was on some sort of American campus, one of those where they found a gimmick to show how cool they are. In this case it was reptiles. Alligators were moving on the side of the street, right next to people. They would hiss or even try to bite if you got close enough. I was moving towards the exit, which was close to the sea and led onto a sort of beach, and I was wondering what would happen if one of these large three-meter animals bit someone. And suddenly something happened. Something large, much larger than a crocodile, came out of the water and snapped up a fully grown adult alligator like a stork snatches frogs. What was that thing? A family of three was watching, fascinated, moving closer to see what was going on. "Are these people stupid?" I thought.

Sure enough, the dinosaur thing bites through the little kid. The father jumps to help but is completely ignored, his sudden pain and anguish irrelevant to the chomping reptile. Other tourists were being attacked. One in particular drew my eyes: he had one of the things pulling with its teeth on his backpack. The man acted like something was annoying him, making repeated "Ugh!" sounds while trying to climb back onto the walkway. His demeanor indicated that he didn't care at all about the reptiles, temporary annoyances that stopped him from returning to normal life. At one point he paused to scratch an itch on his face. His fingers were bitten clean off, blood gurgling from the stumps. He was in shock.

Somehow, a famous actor jumped from somewhere and made it clear that it had been just a show, although the realism of it was so extreme that I felt an immediate cognitive dissonance and the scene switched.

I was at the villa of a very rich friend and there was wind outside. Really strong wind. Nearby wooden booths that were at the edge of his garden were pushed towards me, threatening to crush me against the wall of the building. I went inside, telling my friend that he should take whatever he needs from outside because the wind is going to tear them away. He calmly pressed a button on a remote and an inflatable wall grew around the compound, holding the wind at bay. Cool trick, I thought, trying to figure out if it was firmly anchored to the ground by wires or it got all its structural strength from the material it was made of. It had to have some sort of Kevlar-like component, otherwise it would have been easy to puncture. I calculated the cost of such a feature as astronomical.

We went in, climbing to the second or third floor. Out the window I saw buildings, like near the center of Bucharest. The wind was wreaking havoc on whatever was not firmly fixed. The building across the street was 10 floors tall and apparently someone had been doing roofing work when the wind started. The fire used to heat up the tar was now really intense from all the fresh oxygen and smoke was billowing strongly. Suddenly the top floor caught fire, flames stoked by the wind into unstoppable blazes. I called my friend to the window and showed him what was going on. While we watched, floor after floor was engulfed in flames, explosions starting to be heard from within. The building caught fire like a cigarette smoked in haste. I thought that if the wind went through the building, via broken windows and walls, then a very high temperature could be reached. And as I was thinking that, the building went down. It didn't collapse; instead it leaned towards our villa and crashed right next to it.

I panicked. I knew that on 9/11 the towers collapsed because of the intense heat, but also that a nearby building was structurally weakened by the towers falling and eventually went down too. I ran towards the ground floor, jumping over stairs. If the villa crumbled, I did not want to be stuck inside. While fleeing, I was considering my options. The building that went down had people in it; maybe I should head right there and help people get out. Then I also remembered the thousands of people helping after September 11 who later got lung cancer from all the toxic stuff they put into buildings. Besides, what do I know about rescuing people from rubble? I would probably walk on their heads and crash more stuff on top of the survivors. I decided against it, feeling like an awful human being who is smart enough to rationalize anything.

As I got down I noticed two things. First there was a large suffocating quantity of coarse white dust being blown by the wind, making it hard to breathe. The light was filtered through this dust, giving it a surreal dusky quality. I used my shirt to create a makeshift breathing mask against the dust. Then there were the people. A sexy young woman, short skirt, skimpy blouse, long hair, the type that you see prowling the city centers, was now crawling on the ground, left leg sectioned just under the knee, but not bleeding, instead ending in a carbonized stump, like a lit match that you put out before it disintegrated. I saw people with their skin completely burned off, moving slowly through the rubble, reaching for me. Some were nothing more than bloodied skeletons, some were partially covered in tar, the contrast of red inflamed burns and black tar or burned clothing evoking demonic visions of hell. These people were crying in pain and coming towards me from all sides, climbing over the ruins of buildings and cars. The smell was awful. I ran, parkour style.

As I woke up I tried to remember as much of the dream as possible. I still have no idea what I was doing in that campus and how I had gotten there. I was fascinated by the scene with the shocked tourist, as it implies knowledge of human behavior that I don't normally possess when awake. It was completely believable, yet at the same time bizarre and unexpected. Great scene! I've had disaster dreams before, top quality special effects and all, but never was I confronted with the reality of the aftermath (I usually died myself in those). I am certain I made logical decisions in the situation, but at the same time I felt ashamed at the powerlessness, fear and the fact that I was running away. Should I have stayed and helped (and gotten cancer)? Should I have rushed home and blogged about it? Maybe a strategic retreat in which I would have conferred with experts and then maybe got back with professional reinforcements?

What felt new was that while having all those random thoughts and running, I wasn't really thinking of my life. I wasn't considering whether my life was worth saving or what the purpose of my running actually was. I wasn't thinking at all; just running. It was visceral, like my body was on autopilot, taking that stupid head away from the danger before it thought itself to death. Considering it was a virtual body in my actual head, that's saying something. I am just not sure what.

Anyway, beats the crap out of the one with being late for the exam, not having studied anything...

Wednesday 7 September 2016

On Copyright

Copyright is something that sounds, err... right. It's about keeping the profits from the sale of a work with its author. If I write a book, I expect to get the money from it, not somebody who would print it and sell it in my name. With the digital revolution, copying costs collapsed to zero: anyone can copy anything and spread it globally for free. So what does that mean for me, the author? Well, it sucks. The logical thing for me is to fight for my rights, turn to the law which is supposed to protect me, find technical ways of making copying my work harder, preferably impossible, and go after the people who are chipping away at my hard-earned profits.

However, that only makes sense from the author's point of view, specific to their work, outside of any context. Now, imagine a system that allows anyone to anonymously share anything with anyone else, who in turn can get it with no chance of being identified or even detected. I, the author, can see that my work is being circulated, but I can't stop it or determine who is spreading it. My only recourse is to try to dismantle this devilish system that destroys my living, with the full support of the law and of various companies that want their share of the profits. And here comes the context: in order for me (and the entire media industry) to be able to protect my way of life I need to actively fight against any method that allows anonymous and/or secret sharing of information. In fact, I would be fighting for a global system of surveillance and censorship.

As you have seen, I tried to present the situation from the standpoint of the author. I feel for the poor guy (less for the snakes that get most of the money by controlling distribution and marketing); however, it is clear that I can't be on his side. The hypothetical system that I have been describing kind of exists right now in various forms, and media companies are making efforts to thwart attempts to make it more user-friendly and more widespread. What we see today is the logical continuation of the situation. Forget intelligence agencies looking for terrorist threats. Forget tyrannical states trying to find dissidents. Forget the political correctness police trying to expose and shame people for their beliefs. The real money is in stopping people from distributing knowledge for free, and it has direct (and dire) implications for our ability to speak freely.

Note: I used the expression "speak freely" rather than free speech, because free speech is not actually free at all. Read about it and you will see that it is a completely different thing from the ability and permission to express yourself without repercussions.

Of course, I am not the first one to have thought of this. In many parts of the world it has become more effective to use copyright in order to censor things you don't agree with. Just google it and see. Every time you hear something about economic accords, freedom of information or net neutrality, they are all talking about the same thing, only from different angles. The war is active, in full swing, with all of us as possible casualties, yet few people are even aware of it. Oh, I am not saying that you can do anything about it... or can you?... but at least you should know about it.

Saturday 3 September 2016

The demise of the personal technical blog

No, it's not about mine, although this blog has had its ups and downs. What I want to talk about is the list of blogs I am following and how it (d)evolved.

When I was an enthusiastic beginner in software development I was hunting for interesting blogs that would give me valuable insights into the minds of good developers, the quirks of frameworks, the hidden tools and processes that would make my life better. I was adding blog after blog to my RSS list. Later on, I kind of stopped. I had things to do, work to be done, and unfortunately went through some jobs that were not conducive to learning. Perhaps seeing myself as an expert also hindered my enthusiasm for learning (note to self: don't do that!). The obsolescence of the tool I was using to read RSS and the death of Google Reader did not help either. So recently I went back to that list of blogs and started organizing it with a new tool. I use Feedly now, in case you were wondering.

Today I had an epiphany. I have over 150 blogs that I am "following", 100 of which are software related, yet only very few of them are actually spewing content anymore. In my three-year hiatus from blog reading most of the technical blogs just... stopped. Some of them plain vanished, complete with content that I had linked to in my own articles. Back then I considered blogs to be as permanent as things get. I mean, people just write stuff for the heck of it, so others can read and learn. There would be no reason for any of this to disappear - there are still pages from 1990 active on the Internet, for crying out loud! So what happened?

One theory is that blogs were created as representations of a person's evolution. For example you are a good WPF programmer and you create something like Dr. WPF's blog. When you stop doing WPF (because Microsoft dropped the ball with it!) you stop writing. Perhaps the author still blogs in other places, on other blogs that are thematic, I don't know. Another theory is that people just blog at a certain stage in their life; it's like a quarter-life crisis. When they mature, people stop blogging (which says something...). Maybe the social media explosion pushed people away from personalized platforms and they do all their publishing on Twitter, Facebook, LinkedIn, Medium and so on. As the IT industry moves at an ever-increasing pace, blogs may turn into antiquated relics that are obsolete by the time a handful of posts have been published.

I feel sad either way. When I started blogging, people would come to me for help. After all, I started the site years before StackOverflow arrived on the scene. I would write about programming, books, anime, life, personal ideas, jokes, space, science, rants, whatever. It happened several times that I was looking for a solution to a problem and found myself explaining it in one of my older posts. People still praise some posts because they refer to technology that is maybe a decade old, others because they give the full picture of how I got to the end result. So for all these vanishing blogs, I feel a sense of loss for all the knowledge that was lost, for all the voices that went silent.

I know that as one blog dies ten others appear, but there is no sense of origin anymore, no chronological timeline of the evolution of the person writing. I can even go down the "they are not making them like they used to" road. For me a blog functioned as a sort of resumé of someone's work. If I liked an article, I would look at others, maybe subscribe. This way I would be connected not with a concept, but with a person: as they grew, I grew. And SEO be damned, I don't care that people don't discover my blog anymore because Google can't make up its mind about what I am actually writing about. When people do come, they see me, not just disparate, out-of-context solutions to their five-minute problems.

So I wrote this article to express my sorrow. I guess that I miss my friends, even if they never knew me.

Thursday 1 September 2016

The Emperor's Soul, by Brandon Sanderson

The Emperor's Soul is set in the same world as Elantris, but for all intents and purposes it is a standalone story. It's not a full-size book, but it's a bit longer than a short story. In it we find a type of magic called Forging, by which someone can carve and use a complicated seal to change the history of an object or, indeed, a soul. The forging needs to be very precise and as close as possible to the actual history of the target, otherwise the magic doesn't "stick". So what must a brilliant forger do in order to perform the forbidden soul forging? They must know their target so intimately that, ultimately, they can't bring themselves to hurt them.
An intriguing idea, yet Brandon Sanderson does what he does best and focuses on the characters, while the story itself reveals the underlying motivation of each and how it affects the final outcome. The descriptions are minimal and the ending is optimistic and personal, the way Sanderson often likes his finales.
Since the story is short and I had no trouble reading it in a single evening, it is almost a pity not to read it, even if in itself it is just a tiny "what if" tale and offers nothing spectacular.