When does class attribute initialization code run in Python?
There is a spark class attribute in our class:
class AnalyticsWriter:
    spark = SparkSession.getActiveSession()  # this is not getting executed
I noticed that this code is not being executed before a certain class method is run. Note: I have verified that there is already an active SparkSession available in the process, so the init code is simply not being executed.
    @classmethod
    def measure_upsert(cls) -> DeltaTable:
        assert AnalyticsWriter.spark, "AnalyticsWriter requires an active SparkSession"
I come from JVM-land (Java/Scala), where class-level initialization code runs before any method invocation. What is the equivalent in Python?
Solution – 1
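Class attributes are initialized at the moment they are hit, during class definition. The class body executes top to bottom when the class statement itself runs (normally when the module is imported), so the line containing the getActiveSession() call is run before the class is even fully defined.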
class AnalyticsWriter:
    spark = SparkSession.getActiveSession()  # The code has been run here
    # ... other definitions that occur after spark exists ...
# class is complete here
I suspect the code is doing something, just not what you expect. You can confirm that it is in fact run with a cheesy hack like:
class AnalyticsWriter:
    spark = (SparkSession.getActiveSession(), print("getActiveSession called", flush=True))
which just makes a
tuple of the result of your call and an eager