TiRex - a foundation model for time series
Shownotes
Huggingface: https://huggingface.co/NX-AI/TiRex
Leaderboard: https://huggingface.co/spaces/Salesforce/GIFT-Eval
Do you already know the Rexroth blog?
If you have any questions, please contact us: vertrieb@boschrexroth.de
Produced by Bosch Rexroth AG, Sales Europe Centre, Susanne Noll
Transcript
00:00:00: Hello everybody and welcome to a new episode of our tech podcast.
00:00:08: My name is Robert Weber and today we are talking about TiRex, the first xLSTM-based time
00:00:14: series foundation model, and we were allowed to borrow an episode from the Industrial AI
00:00:20: podcast.
00:00:21: Thanks a lot guys.
00:00:22: That's why Peter Seeberg is hosting this episode, enjoy listening.
00:00:30: Hi there.
00:00:31: Welcome to a new episode of the Industrial AI podcast.
00:00:35: My name is Peter Seeberg and I'm your host.
00:00:38: And today I'm going to be talking to the one and only Sepp Hochreiter.
00:00:42: And Sepp and I are going to be talking about TiRex, and TiRex is the first xLSTM-based time
00:00:50: series foundation model.
00:00:52: Hi Sepp.
00:00:53: Hi.
00:00:54: How are you doing?
00:00:55: I'm fine.
00:00:56: I'm very excited because we are launching TiRex.
00:01:00: I'm very excited.
00:01:01: Oh, that's great.
00:01:02: We're going to be talking about TiRex in just a minute.
00:01:06: We have had you on our show actually two, three times.
00:01:11: And for other reasons, I believe that at least 95 percent of our listeners will have heard
00:01:17: about you.
00:01:18: So maybe very quickly introduce yourself to our listeners.
00:01:23: Yes.
00:01:24: My name is Sepp Hochreiter.
00:01:27: I'm heading the Institute for Machine Learning here in Linz.
00:01:31: It's at the JKU, the Johannes Kepler University.
00:01:34: I'm also chief scientist of the newly founded AI company called NXAI.
00:01:43: And this company is dedicated to bringing AI to industrial applications, to bringing AI to
00:01:51: the machinery, and its focus at the moment is xLSTM, the new technique.
00:01:59: And I'm known for inventing LSTM.
00:02:01: LSTM stands for Long Short-term Memory.
00:02:04: And LSTM started all this chatbot, ChatGPT stuff because the first large language model
00:02:10: was an LSTM model.
00:02:12: And I'm known for LSTM.
00:02:14: Great.
00:02:15: Thank you very much.
00:02:16: Last time that we met was actually in Linz.
00:02:18: You referred to both your new company, NXAI, which you are a co-founder of, as
00:02:25: well as the Johannes Kepler University; both are in Linz.
00:02:29: Yeah.
00:02:30: You already refer to LSTM.
00:02:32: I dare to use the quote, "Great thinkers stand on the shoulders of giants."
00:02:39: And even if it was themselves.
00:02:43: So why don't you quickly take us by the hand, look back at your, I don't know,
00:02:50: maybe 30, 35 years of AI research, and maybe you want to tell us what were the main milestones
00:02:57: that brought you then to xLSTM.
00:03:00: Yes.
00:03:01: I invented LSTM in 1991 in my diploma thesis, where I first analyzed the vanishing gradient,
00:03:09: which is a common problem in deep learning, which you have to overcome to build large
00:03:15: models.
00:03:16: And I proposed the LSTM architecture for recurrent neural networks, for neural networks which
00:03:22: can process time series, which can process text.
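For reference, the vanishing gradient can be sketched as follows (a textbook formulation with generic notation, not quoted from the episode): the error signal flowing back through a recurrent network is scaled by a product of Jacobians, and if their norms stay below one the signal shrinks exponentially with the time distance.

\[
h_t = f(W h_{t-1} + U x_t), \qquad
\frac{\partial \mathcal{L}}{\partial h_t}
= \frac{\partial \mathcal{L}}{\partial h_T}
\prod_{k=t+1}^{T} \frac{\partial h_k}{\partial h_{k-1}}, \qquad
\Bigl\lVert \frac{\partial h_k}{\partial h_{k-1}} \Bigr\rVert < 1
\;\Rightarrow\; \text{the gradient vanishes exponentially in } T - t .
\]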
00:03:27: But then neural networks were not popular anymore in the community.
00:03:32: Support vector machines came; we even had problems publishing LSTM papers.
00:03:37: And then starting in 2006, deep learning came, and starting in 2010, LSTM became very popular:
00:03:47: all the text and speech programs on cell phones were LSTM-based, there were many, many LSTM applications,
00:03:58: the same with Amazon and, you name it, Microsoft and so on.
00:04:02: But then it turned out in 2017 there's another technique, it's called the transformer, where
00:04:09: the attention mechanism is built in, and these architectures are better at parallelizing.
00:04:18: You can push more data through these models in training than you could in LSTMs.
00:04:27: Even though the first large language models were based on LSTM, this parallelization,
00:04:34: getting more training data through at the same time, pushed LSTM from the market and the transformer
00:04:40: was used.
00:04:41: And I always thought, hmm, can we not scale up LSTM like transformers?
00:04:49: Can we not do the same?
00:04:50: Can we not build large models?
00:04:52: Can we not make it faster?
00:04:55: And with xLSTM, we achieved this.
00:04:57: We looked into it, we copied some of the tricks of the transformer technology, added some
00:05:05: of our own tricks from the LSTM technique and then published this xLSTM technology,
00:05:14: which is a model based on the original LSTM, but which can be parallelized and has some other
00:05:23: tricks, which make it really, really powerful.
00:05:27: And we showed it can achieve the performance of transformers in large language modeling.
00:05:33: We will show soon that we are on the same level as transformers.
00:05:37: Right.
00:05:38: We may be going into a little bit more detail later on in this comparison of the transformer
00:05:44: and xLSTM technology, but our topic today is time series.
00:05:49: Now, am I correct in assuming that until recently, with regards to xLSTM, as you just introduced it,
00:05:57: you have been concentrating on language?
00:06:00: So how is time series data different from non-time series data?
00:06:07: So data that does not have any time stamp, from the perspective of the researcher Sepp Hochreiter?
00:06:14: Yes.
00:06:15: First of all, for me, there's not a big difference.
00:06:18: If you give me a sequence, I can use every time series method, because I can assign
00:06:26: time points to the sequence elements, and I can analyze sequences like DNA; even text also
00:06:33: might be a sequence.
00:06:35: From this perspective, there's not a big difference.
00:06:38: But the data is different: if you look at text, there are dependencies between words
00:06:44: which are far away, and these are more abstract symbols you process.
00:06:52: And in time series, in most cases, you have numerical values, you have numbers or a vector
00:06:58: and you process these numerical values.
00:07:02: And often in time series, as the data comes out of a complex system, the system has something
00:07:08: like a hidden state.
00:07:10: It's about in what state is the system and then you want to predict the future or you
00:07:17: want to classify what's happening right now.
00:07:20: This is a difference between abstract symbols, which have some meaning and numerical values
00:07:27: which came out of a complex system with hidden states.
00:07:31: Right.
00:07:32: So referring to the systems, maybe you can give us a couple of examples.
00:07:36: Time series are being used in a variety of very different markets.
00:07:42: Maybe you can give us a couple of examples of use cases and markets where the typical
00:07:47: time series data comes from.
00:07:50: Time series are pervasive, you find them everywhere and you encounter them everywhere.
00:07:58: If you think about weather forecasting, if you're driving with your car and the navigator
00:08:04: tells you the estimated time of arrival, it's a time series, it's forecasting.
00:08:09: If your system tells you, if you have an e-car, when the battery will be empty, it's a time
00:08:15: series problem.
00:08:17: But it's in stock market prediction, in predictive maintenance; in logistics you have
00:08:24: to predict when you have to order new parts for your production, or when your machinery
00:08:33: needs new oil; you have to predict the market.
00:08:38: For example, if you produce something for the car industry, you have to predict how many
00:08:44: cars will be sold to adjust your production.
00:08:49: Very prominent was Amazon, they have time series prediction all across the company, because they
00:08:57: have to predict two things: first, how much a product is bought, and also how long does
00:09:05: it take to deliver it.
00:09:07: Because they have the delivery business, they are, let's say, better at predicting how well
00:09:12: a product is sold than the producers themselves.
00:09:16: Amazon is one big prediction company and the whole business model is built on prediction.
00:09:22: But you need it for climate, you need it for medicine, there's EEG and EKG, there are so
00:09:30: many predictions, you want to know how the body is responding to treatments or during
00:09:38: a surgery, there's applications in agriculture, if you do some corn or apples or whatever,
00:09:47: you have to predict the weather, you have to predict the soil condition.
00:09:53: A very famous application where we are very good is hydrology to predict floodings because
00:10:02: here if it's raining, we have these hidden states, the rain goes into the soil, goes
00:10:09: into underground basins, and you have to memorize how full these basins are because
00:10:17: if they are full, the rain directly will go into the river, otherwise the underground
00:10:21: basins will be filled up before the water goes into the rivers.
00:10:26: And this is a very, very prominent example of how we do earth science in climate change,
00:10:32: where you need this forecasting all the time.
00:10:35: You need forecasting in energy, smart grids, you have to predict the weather for solar
00:10:41: energy and you have also to predict the customer behavior, if there's something like a football
00:10:48: game, like Germany is in the final, everybody turns on the TV and puts a beer into the fridge
00:10:56: or whatever, this is a couple of examples, but there's many, many, many more, it's everywhere,
00:11:01: it's really everywhere.
00:11:02: Yeah, really.
00:11:03: We hear you and I'm sure you could go on for a couple of minutes.
00:11:07: Yeah, so very good and you gave a couple of examples of the specific area.
00:11:12: We have a main interest here in our podcast in the industrial environment.
00:11:17: So since when then, looking back over these, whatever, 35 years, since when have you been looking
00:11:24: specifically at time series?
00:11:26: Is this from the moment that you came up with LSTM? And if that was the case, what were
00:11:32: until then the main algorithmic capabilities? We will come to the new ones later on, but what
00:11:39: were the standards in the past that were capable of looking into the future of time
00:11:43: series?
00:11:44: Yes, I started in kindergarten, I was always interested in predicting the future, but now,
00:11:51: getting to LSTM, the first LSTM applications were time series, because text was not available,
00:11:58: we never thought about doing text with LSTM, and where I come from, only time series were on
00:12:05: our mind, and LSTM was designed for time series, the old, original LSTM, and it performed
00:12:14: very well.
00:12:15: LSTM is used everywhere, even one guy from Google told me LSTM is still used in Google
00:12:23: to translate, because it's faster than the transformer architecture in inference, in applying it,
00:12:29: but LSTM was in many, many industries, in many, many broad domains in industry, for prediction,
00:12:38: and I gave a couple of applications, but there are many more and LSTM was good there.
00:12:45: Alternatively, there were models like ARIMA statistical models, they only do this local
00:12:52: averaging, meaning you make an average over the last values or you calculate a trend or
00:13:00: something like this, this was typically for stock market predictions with traditional
00:13:05: statistical methods and LSTM was better because LSTM could memorize stuff and it could memorize
00:13:13: in what state some system is.
00:13:16: I brought up the hydrology thing here: if it's snowing, the snow does not go to water, the
00:13:22: snow is stored, there's snow lying on the soil, and if the sun shines, the snow turns into
00:13:28: water, and this is something like storing water, also in the glacier, in underground basins.
00:13:34: The same with the sea: if there's a storm at sea, you don't see it, but there's
00:13:40: a hidden state because in the sea, under the water, still a lot of food is in the water
00:13:47: because of the storm before and fish are eating this; there are these hidden states everywhere,
00:13:53: and these statistical methods were not good to capture the hidden states because they
00:13:58: only do this local averaging; LSTM was very good at capturing the hidden states of some systems.
00:14:04: Think about a pipe, you have a water pipe, you open something, water is flowing but on
00:14:10: the other end, it takes a time until the water arrives, but you have to memorize, yes, I opened
00:14:16: the water pipe, I opened it and the water is flowing; this is a hidden state.
00:14:22: Very good, now you have come with a new time series foundation model called TiRex, the
00:14:30: king of time series, I assume that's what you want to convey with that and it's based
00:14:36: on xLSTM, you just introduced xLSTM in the comparison with the transformer, but what are
00:14:42: the main features, what is the USP of TiRex?
00:14:45: TiRex indeed is the king of time series, it's the king of time series models; first
00:14:53: of all, it's based on xLSTM, and I already told you, the original LSTM is very, very good
00:15:01: in time series prediction, now we improved it, but it still kept its super performance
00:15:07: in time series prediction, it's very good, but with all these tricks of the transformers,
00:15:15: it became even more powerful and this is a time series foundation model, what does this
00:15:21: mean?
00:15:22: This is a new kind of time series prediction which comes out of these large language models,
00:15:30: because of the in-context learning: for large language models you can write something in
00:15:35: the context, give some questions or give some examples, and then the large language model
00:15:41: is processing this and gives you an answer; here the idea is, I train a very large model
00:15:47: on many, many different time series and then I give a new time series in context; it's
00:15:53: like a prompt, it's like a question, but in this case with only numerical values, it's a
00:15:59: time series, and then you say, can you give me the future, can you give me the next time
00:16:04: point or the next 10 time points, or can you give me what's happening in 100 time points.
00:16:10: And this is the idea from the large language models, they have so much knowledge, and these
00:16:18: time series foundation models have so much knowledge about time series that I don't
00:16:26: have to learn a new time series, but I already see patterns I saw in other time series, and
00:16:32: if you give a prefix, a beginning of a time series, for some it's clear, yes, the
00:16:38: future will look like this. Here, using these foundation models,
00:16:43: first of all, they allow non-experts to use high-quality time series models, you have
00:16:51: no idea about time series, you put all your values in the context and you get a good prediction,
00:16:56: wow, you don't have to know anything about time series or deep learning, that's the
00:17:02: first big advantage, the second big advantage is, if you don't have enough data, then you
00:17:09: cannot learn a model for your particular domain or time series, but for this foundation model
00:17:16: you only give the beginning of your time series and you don't have any data, you don't have
00:17:21: training data, but the model already makes good predictions; that's perfectly suited
00:17:27: for tasks where not enough data is available. Okay, very good, so what about, so this is
00:17:37: like about the quality, maybe the use, we come to that in a moment, at the very end,
00:17:42: we're going to be looking at some, some benchmark numbers, maybe do some comparison as well,
00:17:47: but before then, if you compare, what about the size of the model, what about the speed
00:17:53: of the model in relation to other solutions in the market? Okay, I will come to numbers later,
00:17:59: but compared to other solutions, I have to mention the other solutions, all the other competitors
00:18:05: in this domain, meaning time series foundation models, are based on the transformer technology,
00:18:12: because it's so popular, it's so successful in large language models, in ChatGPT, you know it,
00:18:20: and they have a problem, they have a problem because they are typically very large, and
00:18:26: they are typically very slow; for example, if you give a time series, as I said, in context,
00:18:34: then for every prediction they have to go over the whole time series again and
00:18:38: again, they are super slow. What we achieved is two things: first of all, our model is
00:18:45: small, our model has, because it's based on xLSTM, a fixed memory, therefore it's perfectly
00:18:54: suited for embedded systems and edge devices, which transformers cannot do, and we are super
00:19:01: fast, we are super fast, because of two reasons, because we are small, okay, if we are small,
00:19:07: we are faster, because we don't have to do so many computations, but also because, in inference,
00:19:13: transformers are quadratic in their context length, in the length of the time series you
00:19:19: give in context, and the LSTM is linear, because it only accesses its memory; it's better, it's
00:19:27: faster, it's much faster, it's smaller and faster.
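As a rough sketch of that scaling argument (generic complexity notation, not official NXAI figures), with T the length of the time series given in context:

\[
\text{Transformer self-attention over the context: compute } O(T^{2}),\ \text{memory } O(T);
\qquad
\text{recurrent xLSTM with fixed memory: compute } O(T),\ \text{memory } O(1).
\]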
00:19:33: And now, the most important thing is, it's even better in prediction quality, in forecasting quality, because the xLSTM we
00:19:41: use is able to do state tracking. I told you, there's a state, like in hydrology:
00:19:49: if you want to predict how much water is in your river, there are
00:19:52: states: water is in the snow, water is in the soil, water is in the underground basins,
00:19:58: and you have to keep track of this. You have to memorize it. You have to track that it was raining,
00:20:03: that the water is going into the soil, but it will flow out later. And this is a state,
00:20:08: this is a hidden state of the system. Also in robotics, a state would be where your robot arm is.
00:20:15: You can memorize what movements you have done and where your robot arm is located.
00:20:22: And LSTM can do that. But Transformers or these fast models like RWKV or Mamba,
00:20:32: these models which came out cannot do the state tracking, cannot keep track or cannot monitor
00:20:39: in which state your system is. And that's so important. And therefore we are, in many time series,
00:20:47: so much better, because we can do state tracking. We can memorize in what state a complex system is.
00:20:54: And to come to the competitors, our competitors are something like Chronos from Amazon,
00:21:01: TimesFM from Google, Moirai from Salesforce, Toto from Datadog, and also Alibaba,
00:21:15: the Chinese company, put some new foundation models for time series only a couple of days ago
00:21:23: onto the Hugging Face leaderboards. And these are just the big companies, they devoted big teams to
00:21:30: get good models and we are considerably better. We are clearly better than all these methods
00:21:37: because we have an advantage because we can do the state tracking. And it's not only a small
00:21:42: difference, it's a clear difference where we are better. And all these big companies could not
00:21:49: keep up with us because it's a technology. It's our technology. It's an NXAI technology,
00:21:55: a European technology which has beaten everything else. And we are not only better in forecasting,
00:22:04: as already said, we are faster and we are smaller. And this is fantastic. That's unbelievable. We are
00:22:10: better, faster, smaller. And we are so happy. We are so excited that we are clearly in front
00:22:19: compared to these teams of these big companies. That's great. We can really feel your excitement,
00:22:27: Sepp, that is really great. Higher quality, more speed, smaller. What does that mean? You
00:22:32: already referred to edge as a potential, maybe give us a couple of typical use cases where you see
00:22:41: TiRex being applied. TiRex should become a standard. If you do some time series work,
00:22:50: then on machinery, if you have a small device and you want to know what's happening on your
00:22:57: machine, or you want to do better control stuff, you should use this, because with machinery
00:23:03: you have to be fast, to interfere fast enough. And you have to be small, because you cannot
00:23:09: put a big computer beside your machinery. Small and fast is important. And being good
00:23:14: is an advantage. Or in process control, like a digital twin, you have a simulation
00:23:20: and you do prognosis, you do forecasting of your system. Like the heat, is it too hot at some point?
00:23:31: If it's too hot, if the forecasting says it will become too hot, you have an industrial process,
00:23:36: you have this small device on the side with TiRex in it. TiRex says, hey, stop, it's becoming
00:23:44: too hot, so you regulate it down. Or TiRex tells you the catalyst is not well distributed,
00:23:51: because with forecasting I can predict the distribution of the catalyst or some
00:23:57: chemical material in your process, and it says, hey, we have to change this, give more of it or whatever.
00:24:04: And this is important because this has to be in real time. If you want to steer
00:24:10: the process, if you want to control the process, it has to have real time capacities.
00:24:16: It has to be small because they have to fit into a small device in an embedded device
00:24:21: in your production system. But also TiRex, you will see it in autonomous driving, because
00:24:27: in cars you have to predict when the battery will be empty, and there are many prediction things.
00:24:33: You will see it in drones, where you have to predict things. You will see it in all autonomous
00:24:40: systems, especially in autonomous production systems, because TiRex is good. TiRex,
00:24:48: I mean, has the quality of prediction, is small, it fits on small devices, and it's super fast.
00:24:54: Yes, that's ideal for industry. Industry should jump on it.
00:24:59: Exactly. And I'm so happy and I'm sure that many listeners are so happy hearing exactly this.
00:25:07: It's almost like as if you have produced, you know, we started working three, four years ago
00:25:12: and now you come with this great solution almost as if it was specifically made for our audience,
00:25:19: so to say. Very good. So you already referred to "it will tell you", but who is the "you"? I mean,
00:25:26: you referred to the continued state tracking, but also to the in-context learning specifically.
00:25:33: So what does that mean? Who is going to be the typical user? Is that changing? Is it more the
00:25:39: data scientist type of very knowledgeable person? Or does it mean that you're going to have like
00:25:45: typically the domain expert being capable of using solutions that are going to be based on TiRex?
00:25:52: That's a good thing because you don't have to be an expert anymore, because you download your TiRex,
00:25:58: you feed your numbers, your time series into the context and you get a prediction. And the prediction
00:26:05: is as good, and in most cases even better, than if you would build a model, using also expert knowledge
00:26:15: in time series research and do a prediction. That's super good because now time series prediction
00:26:22: is open for everybody. But even better, even better, assume you are a company and you sell
00:26:29: a device to different customers. Every customer says, can you adjust the device to my needs?
00:26:37: Can you adjust the device to my environment or to my product or whatever? And then you need
00:26:43: somebody who is fine-tuning the time series prediction model, or the forecasting model,
00:26:49: for each customer. If you use TiRex, for example, you put TiRex on it, on the machinery. It goes to
00:26:57: the customer. As the customer starts the machinery, TiRex will suck in the data from the machinery
00:27:06: and put it in the context and is doing prediction. And if the customer has a new product, TiRex will take
00:27:13: in the data for the new product or the new use of the machinery and can do prediction. If the
00:27:21: machinery is worn out or changes its behavior, TiRex can take in the current data and do prediction.
00:27:30: And you sell something and you don't have to care about it because TiRex can adapt to all changes,
00:27:38: because it can automatically load the time series into the context and track the machinery,
00:27:46: track the use of the machinery. And you don't have to do anything anymore as a company selling
00:27:52: machinery with time series forecasting built into the machines you're selling.
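A minimal sketch of this zero-shot, in-context use in Python, assuming the tirex package described on the NX-AI/TiRex Hugging Face model card with a load_model function and a forecast method; the data and parameter values are made up, and the exact interface should be verified against the model card.

import torch
from tirex import load_model  # assumed package and import path, see the model card

# Historical values of a time series go into the context, exactly as described
# in the episode: no training run, no domain expertise needed.
context = torch.tensor([[12.0, 13.1, 12.7, 14.2, 15.0, 14.8, 16.1, 17.3]])  # (series, time steps)

model = load_model("NX-AI/TiRex")  # pretrained time series foundation model

# Ask for the next 4 time points; the returned object holds the predicted values
# (quantile / point forecasts); check the model card for the exact return structure.
forecast = model.forecast(context=context, prediction_length=4)
print(forecast)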
00:27:57: That's really great to hear. It's a direction that I've been looking at and expecting almost,
00:28:04: like, for quite some time: that domain experts are going to be using their data, the data that they
00:28:12: have kind of been producing but were never capable of doing something with themselves, always needed
00:28:18: to go to other people, third parties or in-house. Now, you gave a general example of a company
00:28:27: selling devices. Now, what is going to be the type of TiRex customer? What kind of product
00:28:35: or service are they going to build on top of TiRex, or are they going to be using TiRex directly?
00:28:42: And maybe you want to tell us then in relation to that, what is going to be the type of license
00:28:48: under which you're going to put TiRex onto the market? First of all, TiRex is a base model we will put
00:28:54: on Hugging Face to show everybody that we are better, better than the Amazon guys, Google guys,
00:29:01: Salesforce guys, Datadog guys, Alibaba guys, you name it, better than the Americans, than the Chinese.
00:29:08: So we have to go out. But what we can do then is fine-tuning. So the base model can do every time
00:29:17: series. But if you have enough data in one domain, you can tweak it a little bit, and you always get
00:29:23: better in this specific domain if you adjust it. And there are tricks for how to do the fine-tuning,
00:29:30: how to adjust it to a specific application, so you get better. So the base model is already better
00:29:39: than the specific models used by the statistics guys, what is used right now. But you can get even better
00:29:48: if you do fine-tuning, fine adjustment, if you go into your domain. And this would be for customers:
00:29:56: maybe we say, we have the base model, but we can adapt it to your use case and you get even better
00:30:02: performance. Perhaps you get even faster; we can adapt it to your hardware, to your chip,
00:30:10: to your embedded device. And here, customers will hopefully pay us so that we
00:30:18: adapt this super cool model, which is super strong, to their hardware, their specific applications.
00:30:26: Very good. Talking about the specific data, I understand. So there's going to be, I don't know,
00:30:34: there's going to be a hydraulic model, there's going to be a whatever type of machine robotic
00:30:39: model, et cetera. Now, the model that you come with, which is already very powerful, that was
00:30:47: based on available public data or maybe also on data from companies that you have been working with
00:30:54: in specific industrial segments. Right now, it's only based on public data. It's important because
00:31:02: otherwise we would have license problems. It's based on public data. And here, a nice thing is,
00:31:08: a couple of days ago a new model came out. It's called Toto, from Datadog, a big American company.
00:31:16: And they had one trillion internal data points, in addition to the public data we are using.
00:31:26: And we're still better. That's like a joke because they used internal data to build their model,
00:31:34: in addition to the data we have available. Imagine: even without all the data
00:31:40: the companies have internally, we are beating them. But what a model we would build if we also had
00:31:48: access to this data; it would be unbelievable. And here we hope that we get more data from our
00:31:56: industrial partners to build on top of this TiRex model even better, more specific models.
00:32:06: Like multivariate data; we already have ideas for how to make it multivariate, stuff like this.
00:32:12: But here, for building this, we need good data. And we are right now collecting data. We are right now
00:32:20: asking different partners, can we collaborate to build even stronger time series models?
00:32:27: But we are so strong already, but we are looking into the future.
00:32:30: You can become even better. And I'm sure that there's going to be hundreds, if not thousands,
00:32:35: of listeners of companies that are going to be very, very much interested in using one way or
00:32:41: the other their data in combination with your TiRex. Okay, let's look a little bit at the numbers.
00:32:48: You referred to two or three competitors, let's say, in the market already. Maybe you want to
00:32:54: share with us what is the number one, or maybe the number two, three, time series benchmarks.
00:33:01: And you referred to two, three potential competitors. And maybe you want to tell us then how TiRex is
00:33:09: performing relative to them. Yes, it's a little bit complicated because there are some evaluation
00:33:16: measures. And if you're not familiar with them, these are only numbers for you. Let's say, if we go back to the
00:33:24: status we were seeing, now there are new submissions, there's one measurement method. It's called CRPS.
00:33:33: It's about probabilistic forecasting, where you not only say one point, but you give something like an
00:33:40: interval, so you know how good it is. And with these numbers, the smaller the number, the better.
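For reference, CRPS is the continuous ranked probability score; the standard definition (general forecasting statistics, not something specific to this leaderboard) compares the whole predictive distribution with the observed value, and it reduces to the absolute error when the forecast is a single point:

\[
\mathrm{CRPS}(F, y) \;=\; \int_{-\infty}^{\infty} \bigl( F(x) - \mathbf{1}\{x \ge y\} \bigr)^{2} \, dx ,
\]

where F is the predictive cumulative distribution function, y is the observed value, and smaller values are better.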
00:33:47: Chronos had 0.48. Chronos is from Amazon. TimesFM from Google has 0.46. TabPFN, that's a method
00:34:01: from Frank Hutter and the Freiburg people, 0.48. All these methods are foundation methods; there's also Moirai
00:34:11: from Salesforce, Salesforce invested a lot into time series. It was about 0.5. And you see,
00:34:21: all lined up at 0.46, 0.47, 0.46, 0.47. And we got, in the same measurement, 0.41.
00:34:30: There's a big gap. All the big companies are competing at the level of 0.46, 0.47. And we,
00:34:39: with our first submission, we got 0.41. You see, it's a gap. Another method, another criterion,
00:34:48: would be the rank. You don't do the evaluation on one time series. You go over many, many time
00:34:54: series. And then you want to see how good you are, on what rank you are, what is the average rank.
00:35:01: Perhaps you're once second, then you're third, then you're first, and you compute the average
00:35:07: rank. And if you do this average rank, on what place you are, for TiRex we got on average
00:35:16: three over many, many, many methods, also specialized methods. And the next best method,
00:35:23: like TimesFM, has six on average, is on place six on average. Chronos is on place seven on average.
00:35:31: Moirai is also on place seven on average. We are at three, and the next ones are at six.
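A small illustration of how such an average rank is computed (made-up scores, not the actual leaderboard data): each model is ranked per dataset by its score, and the ranks are then averaged across datasets.

# Average-rank aggregation across datasets (illustrative, made-up numbers; ties ignored).
# Lower score = better on that dataset; lower average rank = better overall.
scores = {
    "dataset_a": {"model_x": 0.41, "model_y": 0.46, "model_z": 0.47},
    "dataset_b": {"model_x": 0.52, "model_y": 0.50, "model_z": 0.55},
    "dataset_c": {"model_x": 0.30, "model_y": 0.38, "model_z": 0.36},
}

ranks = {model: [] for model in next(iter(scores.values()))}
for per_model in scores.values():
    ordered = sorted(per_model, key=per_model.get)      # best (lowest) score first
    for rank, model in enumerate(ordered, start=1):
        ranks[model].append(rank)

average_rank = {model: sum(r) / len(r) for model, r in ranks.items()}
print(average_rank)  # e.g. {'model_x': 1.33, 'model_y': 2.0, 'model_z': 2.67}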
00:35:37: You see, there's a big gap, whether you measure the performance directly, the prediction performance,
00:35:44: or whether you rank the methods, say, on what place you are, are you first or second, and then average
00:35:51: over the places. We are, also with a big gap, better than all others. It's so fantastic, we
00:35:57: couldn't believe it, that we are performing so, so well. And the reason for this is the technologies
00:36:05: that you referred to, continued state tracking, in-context learning, in combination with, on top of,
00:36:11: xLSTM, whereas all the other ones are transformer-based, or is that not necessarily
00:36:17: really clear? All the others now are transformer-based, because transformers are so popular.
00:36:22: But in industry, in practice, LSTM performed very well. LSTM was always strong in time series,
00:36:29: but the more modern ones were transformer-based methods. But this is now in-context learning, where you
00:36:36: don't learn, which is known from large language models. And so everybody jumped onto transformers,
00:36:42: because we know transformers can do this. It was not clear whether LSTM or xLSTM can do this,
00:36:49: but xLSTM can do it. For me, it was clear, because we are also doing language,
00:36:55: but here it went through the roof with this performance.
00:36:58: Very good. Congratulations. It sounds really, really impressive. Before we're going to close
00:37:06: off, why don't you share with us maybe where you are based, where your teams are, both for NXAI
00:37:13: as well as for your job at the Johannes Kepler University; maybe you're looking for new colleagues,
00:37:19: are there jobs open, maybe, and if so, what should interested people bring?
00:37:24: Yes, indeed, we have jobs open. We are located in Linz, both of us: the company NXAI is in Linz, and also
00:37:34: my institute at the university is in Linz. We are always looking for very motivated,
00:37:41: interested researchers, but also developers. It's such an exciting field. Believe me,
00:37:48: if you join us, you will have fun. It is just great to do it, and there is also a lot of success.
00:37:54: What we also offer is a dual system, so that you can also work from home half of the time or
00:38:02: something like this. This can be negotiated. And you have a very inspiring environment,
00:38:09: many researchers, many new ideas, everything is on fire.
00:38:13: That's amazing. Maybe I'll consider a job with you. No, I will not. But you, dear listener,
00:38:23: I'm sure that there's going to be many, many people, and I think the most important thing is, you
00:38:28: are kind of, well, also too modest, but we can all feel your excitement. We heard again
00:38:36: today about the great technology coming from you, coming from Linz, from Austria, coming from Europe.
00:38:44: I can only support you and suggest to any interested party or person listening to be
00:38:52: contacting you. So, Sepp, thank you very, very much again. As I suggested before, it
00:38:58: feels almost like you are now so close to our industry, to our industrial environment here.
00:39:06: We are very, very much looking forward to seeing solutions based on TiRex, the time series
00:39:14: foundation model that is better, smaller and faster. Thank you very much, Sepp, and looking
00:39:19: forward to seeing you soon in the Alps again. Yes, it was a pleasure, and please check out TiRex.
00:39:26: It's rewarding. Thank you, Sepp. Bye-bye. Bye-bye, ciao.